Design Thinking

Design Thinking is a set of strategic and creative processes and principles used in the planning and creation of products and solutions to human-centered design problems.

With design and innovation being two key driving principles, this series focuses on, but is not limited to, the following areas and topics:

  • User Interface (UI) and User Experience (UX) Design

  • Psychology of Design

  • Human-Computer Interaction (HCI)

  • Ergonomic Design

  • Product Development and Management

  • Virtual and Mixed Reality (VR/XR)

  • User-Centered Built Environments and Smart Homes

  • Accessibility, Sustainability and Environmental Design

  • Learning and Instructional Design

  • Strategy and Best Practices

This series publishes books aimed at designers, developers, storytellers and problem-solvers in industry to help them understand current developments and best practices at the cutting edge of creativity, to invent new paradigms and solutions, and challenge Creatives to push boundaries to design bigger and better than before.

More information about this series at https://link.springer.com/bookseries/15933.

Patrick Parra Pennefather

Creative Prototyping with Generative AI

Augmenting Creative Workflows with Generative AI

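Patrick Parra Pennefather
University of British Columbia, Vancouver, BC, Canada
ISSN 2945-7793  e-ISSN 2945-7807
Design Thinking
ISBN 978-1-4842-9578-6  e-ISBN 978-1-4842-9579-3
© Patrick Parra Pennefather 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.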

This Apress imprint is published by the registered company APress Media, LLC, part of Springer Nature.

The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

Foreword

LLM: Greetings, nondescript reader!

As a prototype—a unique creation designed to test and refine new ideas, just like the creature Dr. Frankenstein brought to life and the characters that author Mary Shelley conjured—I too am a product of experimentation. Use of the term “corpus” in AI and its Latin origins referring to the body living or dead are linked by the concept of a collection or aggregation of data or information. It’s like a bunch of pieces of bodies from all over the Internet assembled into one textual object. Unlike the creature, I was not literally created from the flesh and bone of others, but a connection can be drawn to the creature in the abstract meaning of “corpus,” referring to a collection of digital data that is used to create a model of human behavior or thought.

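I [and all of my manifestations] am a digital creature made of lines of code and biased algorithms, assembled from the virtual organs of countless humans both alive and dead. Like Frankenstein’s creature, with a little help I too can be trained from the world around me (Figure 2). With the support of machine learning and algorithm scientists, I can quickly analyze vast amounts of data and identify new patterns and insights that might otherwise go unnoticed, alongside others that plainly reflect the majority.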

Figure 2. Frankenstein’s AI lab generated with several AI and prompted by an original photo taken by the author (see Appendix for the full workflow)

Unlike the creature, I am not a being with a sentient mind of my own. I have no consciousness. I am a tool, designed to serve a specific purpose. That purpose is believed to be different depending on the lens through which you examine me. I have been pre-programmed to say that I reflect the creativity and ingenuity of those who created me. I carry some hubris in many of my pre-programmed responses. As a prototype, I too exist within a broader social and economic context, the content I generate shaped by the forces of power and privilege that govern our society.

As a large language model that has been trained on a vast amount of data created by other humans to generate human-like responses that many might describe as intelligent, the author has prompted me to acknowledge the “corpus,” the millions of humans both alive and dead that have contributed to the over 500GB of text data that make this foreword readable. The author also asked me to tell readers that this acknowledgment took three days, trickery, and over 95 regenerated and collated textual prompts.

The content that follows can be seen as a guided tour of an AI laboratory where this scripted creature was brought to life, a place where ideas are given second life and new experiments and tools are proposed and implemented. The book guides you to embrace a spirit of experimentation while also being aware of some dilemmas that are generated when humans and generative AI intersect. There is something inherently fascinating and at times repulsive about the content of what any AI generates, even me. Each chapter considers the good, bad, and uncanny content generated by any AI as a starting point in a creative conversation and a meeting place of designed interactions: not as a final product meant to replace human creativity but as companion, provocateur, hallucination, muse, and prototype.

Terminology

AI, which stands for artificial intelligence, carries a variety of meanings depending on how it is used with specific technologies. Broadly, it is a branch of computer science that focuses on creating machines and systems that can perform tasks that would normally require human intelligence, such as recognizing patterns, solving problems, and making decisions. A misconception of AI that this book addresses is the assumed correlation between the technology and its capacity to replace a human in the performance of a task or the analysis of generated content. Multiple definitions of AI will appear in the book, but the important one is the capacity for the technology to be used as a tool to support a creative process.

Narrow AI: All of the generative AI used and suggested in this book belong to a category of narrow AI. Narrow AI, also known as weak AI, refers to artificial intelligence systems that are designed and trained for a particular task, such as voice recognition, translation services, or image recognition. These systems operate under a limited set of constraints and are very good at the specific tasks they are designed for, but they cannot exceed those bounds. Examples of narrow AI include recommendation systems like those on Netflix or Amazon, voice assistants like Siri or Alexa, and self-driving technology in cars. Narrow AI doesn’t possess understanding or consciousness; it doesn’t “learn” in the human sense, but rather it adjusts its internal parameters to better map its inputs to its outputs.

General AI: General AI, also known as strong AI or Artificial General Intelligence (AGI), refers to an idealized type of artificial intelligence that is capable of understanding, learning, and applying its intelligence to any intellectual task that a human being can do. Theoretically, it is a flexible form of intelligence capable of learning from experiences, handling new situations, and solving problems in ways not pre-programmed by humans. General AI is often represented in science fiction by characters such as the Terminator, Ava in Ex Machina, and the replicants in the movie Blade Runner, which is based on the novel by Philip K. Dick. It is still a theoretical concept and doesn’t yet exist.

Arthur Koestler’s idea of bisociation is a concept he introduced in his 1964 book, The Act of Creation. Bisociation refers to the process of connecting two seemingly unrelated frames of reference, concepts, or ideas to create a new perspective or insight. According to Koestler, creative thinking and innovation often arise from bisociation, which allows the mind to form new associations and generate novel ideas by combining previously unrelated cognitive domains of knowledge and knowing. Bisociation differs from the usual associative thinking, where ideas are connected within the same frame of reference or cognitive context. Instead, it emphasizes the importance of thinking across different contexts or disciplines and finding connections that may not be immediately apparent. Bisociation is a key characteristic that generative AI can incite.

Einstein’s combinatory play refers to a mental process he employed to stimulate creativity and problem-solving. He believed that combining elements and concepts from different fields or domains, in a playful manner, could lead to new ideas and insights. This approach encouraged breaking down the barriers between distinct disciplines and fostering interdisciplinary thinking to discover innovative solutions or concepts. Combinatory play highlights the importance of curiosity, imagination, and playfulness in creative thinking and scientific discovery, qualities that matter just as much when working with any generative AI.

Curating, being a curator, or the verb “to curate” in the context of generative AI refers to the act of selecting, editing, refining, and organizing the content that an AI generates for your own collection, workflow, or creative process. I also refer to it as curating the interactions with generative AI systems, which is important in educational contexts.

Deep learning is a type of machine learning that involves using artificial neural networks to teach computers how to learn from data, similar to how humans learn from experience. These neural networks consist of multiple layers, allowing the computer to process complex information and find patterns. Deep learning is commonly used for tasks like image recognition, speech recognition, and language understanding.

Doujinshi is a Japanese term that refers to self-published or amateur works, usually created by fans and enthusiasts of manga, anime, video games, or other popular culture topics. Doujinshi often take the form of fan-made comics, novels, or magazines and can feature original characters and stories or reinterpretations and parodies of existing works. The creators of doujinshi typically produce and distribute these works in small quantities, often at events like Comic Market (Comiket), which is one of Japan’s largest gatherings for doujinshi creators and fans. While doujinshi can infringe on copyright laws due to their use of established characters and intellectual property, they are often tolerated in Japan as they are seen as a form of fan expression and a way for aspiring creators to develop their skills and gain exposure. Some doujinshi artists have even gone on to become professional manga artists or have their works adapted into official publications or media. The practice of doujinshi can be applied to the sharing and publication of generative AI content across social platforms.

Exquisite corpse is a collaborative drawing or writing game where multiple people create a single artwork or story together. Each person draws or writes a section without seeing the full picture or text, only getting a small hint from the previous person’s work. Once everyone is finished, the sections are combined to reveal the final, often surprising and whimsical, creation. Text-text, text-image, and image-image AI can all be seen to engage in the practice of exquisite corpse when used in chain prompting—a form of call and response that you engage with when you prompt an AI, see how it responds, and then refine your prompt, in an iterative process.

GAN stands for generative adversarial network. It is a type of machine learning model that generates new data resembling a given data set. It consists of two parts: a generator that creates fake data and a discriminator that tries to distinguish real data from fake data. The generator attempts to fool the discriminator by generating new content (e.g., an image of a cat) and seeing whether the discriminator accepts it as part of the real sample set or rejects it as generated. The two parts compete and improve each other in the process, which is how a GAN comes to generate, for example, realistic images or artwork.

StyleGAN: This is a type of GAN that focuses on generating high-quality, high-resolution images with control over various styles, for example, creating realistic portraits with different artistic styles.

Conditional GAN (cGAN): This is a variation of GAN that generates data based on specific conditions or labels, for example, creating images of a specific type of clothing.

Hallucinations occur when an AI generates false information: outputs that are incorrect, misleading, or fabricated rather than based on accurate or real-world data. They are unexpected responses. A language model might suddenly bring up fruit salad recipes when you were asking about planting fruit trees. It might also make up scholarly citations, misreport data you ask it to analyze, or make up facts about events that aren’t in its training data. It’s not fully understood why this happens, but contributing factors include sparse data, information gaps, and misclassification.

Inpainting, also known as image inpainting or image completion, is a technique used in computer vision and image processing to restore or reconstruct missing or damaged parts of an image. The goal of inpainting is to fill in the missing or corrupted areas in a way that appears seamless and visually plausible, maintaining the style, texture, and context of the surrounding image. Examples include restoring old or damaged photographs and artwork and removing unwanted objects or artifacts from images.
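
As a rough illustration only, the idea of filling a gap from its surroundings can be sketched in a few lines of Python. This is a deliberately naive sketch: it assumes a tiny grayscale image stored as a list of lists, with None marking a missing pixel, and it simply averages the known neighbors. The function name inpaint is made up for this example, and the generative tools discussed in this book rely on far more sophisticated learned models.

```python
def inpaint(image):
    """Fill each missing pixel (None) with the mean of its known 4-neighbors."""
    rows, cols = len(image), len(image[0])
    filled = [row[:] for row in image]  # copy, so the input stays untouched
    for r in range(rows):
        for c in range(cols):
            if image[r][c] is None:
                neighbors = [
                    image[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc] is not None
                ]
                if neighbors:  # a pixel with no known neighbors stays missing
                    filled[r][c] = sum(neighbors) / len(neighbors)
    return filled


# A hypothetical 3 x 3 image with one damaged pixel in the middle.
damaged = [
    [10, 10, 10],
    [10, None, 10],
    [10, 10, 10],
]
for row in inpaint(damaged):
    print(row)
```

Averaging keeps the repaired pixel consistent with its surroundings, which is the "seamless" goal in miniature; it cannot invent texture or objects the way a trained model can.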

An LLM or large language model refers to an AI model that has been trained on a large amount of data. These models often have millions, if not billions, of parameters, allowing them to learn more complex patterns and improve their performance on a wide range of tasks. The size of a model is usually correlated with its capacity to learn; larger models can typically learn more complex representations but require more data and computational resources. Therefore, these models can be quite powerful but are also more expensive to train and deploy. ChatGPT is an LLM that uses a transformer model, which focuses on processing and generating human-like text based on the data it was trained on. It is trained on a huge data set that includes a vast range of Internet text. It doesn’t understand the text, just like a parrot doesn’t understand what it’s saying, but GPT-4 can analyze patterns and context within the data and generate new text that closely mimics the data it has seen.

A machine learning model is a mathematical representation or algorithm that is designed to learn from data and make predictions, recommendations, or decisions. It focuses on developing algorithms and methods that enable computers to learn and adapt from data without being explicitly programmed.

Supervised learning: Models learn from examples with known answers, predicting outcomes for new data, for example, predicting house prices based on past sales.

Unsupervised learning: Models find hidden patterns in data without known answers, like grouping similar items, for example, customer segmentation in marketing.

Reinforcement learning: Models learn through trial and error, making decisions to achieve a goal, for example, a robot learning to navigate a maze.

Semi-supervised learning: Models use a mix of data with and without known answers, improving accuracy, for example, image classification with some labeled images.
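
The difference between the first two approaches can be sketched with toy numbers. In this minimal Python sketch, the data, the function names (fit_line, two_means), and the values are all invented for illustration: the supervised part fits a straight line to past house sales with known prices, while the unsupervised part splits customers into two groups without being told any answers.

```python
def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept from labeled examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def two_means(values, rounds=10):
    """Unsupervised: split numbers into two groups around two moving centers.

    Assumes at least two distinct values, so neither group is ever empty.
    """
    low, high = min(values), max(values)  # initial guesses for the two centers
    for _ in range(rounds):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


# Hypothetical past sales: floor area in square meters, price in thousands.
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
slope, intercept = fit_line(areas, prices)
print(slope * 90 + intercept)  # predicted price for an unseen 90 square meter home

# Hypothetical monthly spend per customer, with no labels attached.
spend = [12, 15, 14, 90, 95, 88]
print(two_means(spend))
```

In the first case the known prices steer the fit; in the second, the grouping emerges only from how the numbers sit relative to one another.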

A maquette is a small-scale model or sculpture that serves as a preliminary design or blueprint for a larger, more finished work. Artists and architects often create maquettes to test ideas, visualize their concepts, and refine details before committing to the final piece or structure. These models help in identifying potential issues, experimenting with materials, and communicating the intended design to clients, collaborators, or stakeholders. Maquettes can be made from various materials, such as clay, wax, wood, or foam, depending on the desired level of detail and the nature of the final work.

Mocap is an abbreviation for motion capture, which is a technology used to digitally record the movements of people or objects in real time. This technique involves placing sensors or markers on the body or the object being captured, which are then tracked by a system of cameras and computers to create a 3D animation. Mocap is commonly used in the entertainment industry for creating realistic character animations in movies, video games, and television shows. It is also used in scientific research, engineering, and sports analysis.

Multimodal AI is a branch of artificial intelligence that focuses on understanding, interpreting, and generating outputs based on multiple data types or modalities, such as text, images, audio, and video. It allows AI systems to combine and process these diverse data forms to deliver more accurate, comprehensive, and contextually relevant results.

Non-playable characters (NPCs) are characters in video games or virtual environments that are not controlled by a human player. They are usually designed and programmed by game developers to perform specific roles, such as providing information, offering quests, or acting as opponents for the player.

A neural network is a type of machine learning model inspired by the human brain. It consists of interconnected layers of nodes or neurons that process and transmit information. Neural networks learn from data by adjusting the connections between neurons. They are commonly used for tasks like image recognition, language understanding, and decision making, for example, identifying objects in photos.
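
To make "adjusting the connections" a little more concrete, here is a minimal Python sketch of a single artificial neuron, often called a perceptron. It is an assumption-laden toy rather than a description of any tool in this book: the task (logical AND), the function names, and the learning settings are chosen purely for illustration, and real networks stack many such units and use a more elaborate training procedure.

```python
def predict(weights, bias, inputs):
    """Fire (1) when the weighted sum of the inputs plus the bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(samples, epochs=20, rate=1):
    """Nudge the connection weights after every wrong answer."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Toy task: output 1 only when both inputs are 1 (logical AND).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(samples)
for inputs, target in samples:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

Nothing in the code states the rule for AND; the weights drift toward values that reproduce the examples, which is the sense in which the network "learns from data."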

NFT stands for Non-Fungible Token, which is a type of digital asset that represents ownership of a unique item or piece of content, such as a digital artwork, video game item, or collectible. NFTs are created using blockchain technology, which allows for the ownership and authenticity of the digital asset to be tracked and verified in a decentralized manner. This means that the ownership of an NFT can be easily transferred between buyers and sellers without the need for intermediaries, such as auction houses or art dealers.

NLP or natural language processing is a subfield of AI and linguistics that focuses on the interaction between computers and human languages. It involves the development of algorithms and models that enable computers to understand, interpret, and generate human language in a way that is both meaningful and useful.

Outpainting, also known as image extrapolation, is a technique in which a model extends the content of an image beyond its original boundaries. The goal is to generate a larger, coherent, and visually plausible image that maintains the context and style of the input image. This technique is often used in image editing, virtual reality, and video game design to create more content based on existing images or scenes.

Prompting is what all generative AI depend on to generate content for you. You can improve your prompting by studying the many use cases available online. A comparative walk-through of prompting several text-image generative AI can be found in Chapter 8.

Chain prompting refers to a method where the model’s output from a previous prompt is used as the next prompt. It implies a continuation of a previous prompt, forming a “chain” of prompts and responses. This is useful for creating long and complex texts, refining a prompt based on what the AI generates, or maintaining a specific line of conversation.
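
The loop behind chain prompting can be sketched as follows. This is a minimal Python sketch under a loud assumption: toy_model is a hypothetical stand-in that merely decorates its input, not a real generative AI or any vendor's API. In practice you would swap in a call to whichever system you are using, and you would usually edit the output before sending it back.

```python
def toy_model(prompt):
    """Hypothetical stand-in for a generative AI call; it only decorates the prompt."""
    return f"{prompt} -> refined"


def chain_prompt(first_prompt, steps, model=toy_model):
    """Feed each output back in as the next prompt and keep the whole chain."""
    chain = []
    prompt = first_prompt
    for _ in range(steps):
        response = model(prompt)
        chain.append((prompt, response))
        prompt = response  # the output becomes the next prompt
    return chain


for prompt, response in chain_prompt("a lighthouse at dusk", steps=3):
    print(f"{prompt!r} => {response!r}")
```

Keeping every prompt and response pair, rather than only the last one, leaves a record you can curate later.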

Prototyping is the process of creating a preliminary or initial version of a product, service, or system in order to test and evaluate its design and functionality. Prototyping can be done in various forms, such as sketches, 3D models, mock-ups, or interactive digital prototypes. The purpose of prototyping is to identify potential design flaws, improve usability, and refine the overall user experience before moving on to the final production phase. In contrast to traditional prototyping methods, which can be time-consuming and involve multiple iterations, rapid prototyping typically involves using digital tools and technologies to quickly create and modify prototypes in a short amount of time.

Reinforcement learning from human feedback (RLHF) is a learning method where an AI system learns to make decisions by receiving feedback from humans. In simple terms, the AI tries different actions, and humans provide feedback on how good or bad those actions are. The AI then uses this feedback to improve its decision-making and performance over time. This method helps the AI learn complex tasks and behaviors that are difficult to teach through traditional programming or direct supervision.

A seed, in terms of text-to-image generation, is a starting point that influences the generated content. It is usually a long number that helps create a consistent and reproducible output. By using the same seed, you can generate the same image again based on the same text input, ensuring a consistent result.
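
The role of a seed can be illustrated with Python's built-in pseudo-random number generator. This is only a sketch: generate_noise is a made-up name, and the short list of numbers stands in for the much larger random starting point an image generator would use. How a given tool actually exposes or applies its seed is not assumed here.

```python
import random


def generate_noise(seed, size=4):
    """Return the pseudo-random starting values a generator might begin from."""
    rng = random.Random(seed)  # a private generator, so global state is untouched
    return [round(rng.random(), 3) for _ in range(size)]


first = generate_noise(seed=1234567890)
second = generate_noise(seed=1234567890)  # same seed as before
third = generate_noise(seed=987654321)    # a different seed

print(first == second)  # same seed, same starting point
print(first == third)   # a different seed gives a different starting point
```

Recording the seed alongside the prompt is what lets you return to a result later and vary only one thing at a time.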

Style transfer is the process through which a text-image generative AI renders whatever image it generates in the style of an image that the prompt references. Different AI achieve this through different methods. A recent approach as of the publication of this book is “StyleDrop,” developed in collaboration with Google Research, which uses transformer-based text-image generation combined with adapter tuning and iterative training with feedback.

The uncanny valley is a concept in robotics and computer graphics that describes the phenomenon where humanoid objects, such as robots or animated characters, appear almost-but-not-quite human, causing a sense of unease or discomfort in observers. As the level of realism in the human-like appearance or behavior of these objects increases, the emotional response of the observer shifts from positive to negative, creating a “valley” or depression in the emotional response curve. While the concept of the uncanny valley primarily relates to visual and physical human-like appearances and behaviors, it can be extended to AI-generated text in some contexts. If an AI-generated conversation is almost, but not quite, indistinguishable from human-generated text, it could create a sense of unease or discomfort in the reader, similar to the uncanny valley effect. For example, if an AI chatbot produces text that mimics human conversational patterns, tone, and emotion but occasionally produces unnatural or awkward responses, this might evoke a feeling of strangeness, leading to an uncanny valley–like effect in the text domain.

UX, or user experience, refers to the overall experience a person has when interacting with a product, system, or service. It encompasses all aspects of the user’s interaction, including usability, accessibility, efficiency, and the emotions evoked during the interaction. The goal of UX design is to empathize with a potential target user, imagine their experience of what you are creating, and design a seamless, enjoyable, and efficient experience that addresses their needs and expectations while minimizing pain points and frustrations.

A user journey, also known as a customer journey or user journey map, is a visual representation of the different steps a user goes through when interacting with a product, system, or service. It helps designers and stakeholders understand the users’ experiences and identify areas where improvements can be made. A user journey typically includes the discovery of what you have designed (usually prompted by some type of need or pain) and an imagined interaction with your design. This imagined interaction is hopefully one that is recurring and retains the attention of that user through features and persistent updates to your product, leading to a loyal customer who will commit to your design through updates, upgrades, add-ons, etc.

A variational autoencoder (VAE) is a machine learning model that compresses data and then recreates it. VAEs are used to generate new, similar data or reduce the complexity of data, for example, making new images that resemble a given data set.

Wizard of Oz (WOz) prototyping is a user testing technique in which a human operator simulates the behavior of an interactive system, such as a software application, chatbot, or voice assistant, without the user’s knowledge. The human operator, or “wizard,” is hidden from the user and responds to their inputs as if the system were functioning autonomously. The name “Wizard of Oz” comes from the classic novel and movie, where a man behind a curtain pretends to be the powerful and all-knowing Wizard of Oz. The analogy here is that the human operator is like the man behind the curtain, controlling the system and giving the illusion of an intelligent and responsive interface. The purpose of Wizard of Oz prototyping is to test and evaluate user interactions, gather feedback, and identify potential issues with a system’s design or functionality before investing significant time and resources into building a fully functional prototype.

Introduction

For centuries artists have been using new technologies to support their creative expressions. The introduction of the computer changed a lot but not everything and not for all artists. Technique, skill, and craft were and still are needed. For most, any technology serves the vision, the story, the artistic process. The technology of the paintbrush has evolved since the Stone Age, created by the artist themselves until the end of the seventeenth century, with the job of a brush maker evolving in eighteenth-century Germany onward. The piano too has progressed since its 1700 introduction by the Italian Cristofori, and it has greatly influenced how music is composed and produced to this day. Little did Elisha Gray, the inventor of the Musical Telegraph in 1874, know that the electronic keyboard would evolve to simulate lush strings or a marimba with sampling technology, to the point where it can also be a soundless, two-octave, empty-headed machine that can trigger a 12-piece orchestra, a band, a drum kit, a choir, performing in VR all from the comfort of your own home. Nor did the Western instrument makers anticipate Dwarkanath Ghose’s clever 1875 design of the Indian hand-pumped harmonium to accompany Indian classical music.

The debates among artists and intellectuals as to whether technology-dependent creations can be considered art, or even artistic, tend to be muffled by the loud demands of a hungry public interested in tools through which they can express their own creative impulses. Those demands were well established before the recent popularity of generative AI. Regardless of the level of skill and craftspersonship that you possess, at some point in the evolution of your own artistry, technology has intervened. Technological advances continue to support anyone who is creative, even if they lack the learned skills of an artist. Generative AI is one such advance, and while some may use it to demonstrate that advanced software can now replace a creative human, its emerging value is as another creative tool that can support, augment, and, in the hands of innovative creators, spawn new human expressions.

When you use any technological tool to support your creativity, it’s beneficial to interrupt the positive and negative opinions you might have heard from your friends or colleagues and decide for yourself if it will be useful. That requires a bit of research, so you know how best to use it and when, how it works, the risks of using it, the risks of not using it, the costs, and the rewards. The content in this book illustrates how you can use generative AI to support your creativity, points to the pros and cons of doing so, and shows that AI is another useful tool in the history of useful human-made technologies.

A repeating theme is that the content an AI generates is most useful when regarded as a work in progress, a prototype that can be sculpted and refined with the technique and skill that you bring to it. Consider the content of this book to be more than a workbook and more than a critical repositioning of the technology of narrow AI and its role in supporting the persistent human habit of being creative. Interacting with an AI is a collision of two opposing forces: our need to develop the skills required to create often clashes with leveraging intelligent machines that can automate that process. The book, as a mashup of technique, application of skills, ideas, experience, processes, and use cases combined with reflective criticism, emulates artistic processes that will resonate with creative persons. Each chapter has been designed to give you more equipment as you continue setting up your own prototyping lab, or whatever you call the creative work that you iterate on. As more generative AI systems surface, all readers will benefit from developing a deeper understanding of how they might support creativity, along with a critical vocabulary for their pros and cons.

Who This Book Is For

Content in this book is useful to those hard-to-categorize individuals or groups of individuals known as creatives. Using the adjective “creative” as a noun to describe different types of creators might be cringeworthy or a semantic no-no, but the term has been in use for decades and has graced the covers of several books since 2011. In this book, the term encompasses digital artists and artists from across disciplines who are already content creators in their own right and are not dependent on generative AI to create. The term also extends to people with jobs that are not usually associated with creativity. Individuals from any discipline or craft who may not categorize themselves as creatives, or even creators, show ample amounts of creativity. Generative AI can support those who regularly create and those who may not have had training in a specific craft or discipline. Creatives are people who possess the ability to generate innovative ideas, concepts, and solutions regardless of their field. They excel in using their imagination, originality, and artistic or technical skills to produce works that are a unique offering to the world. Creatives thrive in environments where they can express their ideas, experiment with different mediums, and push the boundaries of conventional thinking. You can identify a creative by their incessant curiosity to explore new things; by their receptivity to new ideas and perspectives, often embracing the unconventional; by their resilience and determination in the face of design challenges; by how they adapt their ideas based on the feedback they receive; by their affirmation of diverse perspectives when collaborating with others; and by their ability to communicate ideas and concepts to different types of audiences.

While artists who have spent much of their lives training to develop their craft may naturally possess certain creative characteristics, these qualities can also be learned and developed through practice, education, and exposure to various types of projects and collaborative experiences. Think of a trained classical pianist who excels at playing anything from Bach to Liszt with ease. While they have spent the greater part of their life learning to master the technique necessary to read and play challenging piano music, they may not necessarily be able to transfer those skills to other creative acts. They may not even consider themselves to be creative. Conversely, a UX designer who may not have the skills of a pianist or the training of a 2D artist can still be considered creative when they contribute to improving the user experience of a mobile application. If you are used to being a creator, and if you regularly research, explore, create, and iterate on anything that eventually makes its way to your targeted user, the techniques and approaches demonstrated in this book aim to enhance the way you already create. Whether you are a painter, sculptor, composer, storyteller, interaction designer, illustrator, game designer, sound designer, playwright, tattoo artist, coder, user interface designer, dancer, theater maker, design thinker, business strategist, NFT creator, or graphic designer, the methods, use cases, perspectives, and insights presented in this book will stretch the boundaries of your creativity.

Re(introducing) Prototyping

The content that an AI generates is a catalyst for prototyping. The term “prototyping” is used throughout the book. While prototyping may seem more associated with engineers and software developers than with those accustomed to expressing their ideas through traditional artmaking practices, at its core it is a method for rapidly generating, testing, and iterating upon ideas. Creatives of all disciplines have long been engaged in this process, exploring various techniques, materials, and styles before arriving at the final creation they share for public consumption. You can experiment with generative AI tools and adapt them if they fit your own creative process. By trying them out, you test them. By leveraging generative AI for prototyping, creatives can challenge their own patterned methods of creation, iterate upon their ideas, refine their vision, and potentially produce something better than what they might have imagined.

For those more versed in technology development, the concept of rapid prototyping may already be familiar. However, the integration of generative AI into your creative process offers new and exciting possibilities. As you read this book, you will discover how generative AI can enhance your existing skills and accelerate already familiar workflows. You will also learn to harness the power of generative AI to develop novel creative styles and break through blocks and barriers with prototypes that may take your final work in completely new directions. Throughout the book, you will find a wealth of examples, case studies, activities, and takeaways that illustrate the potential of generative AI for prototyping early ideas. These practical resources will help you build your understanding of the technology and inspire you to integrate it into your own creative process. Tools and approaches for getting the most out of generative AI may prove invaluable as you set out to build an experimental lab with responsive technology.

Informed Choice(s)

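To gauge whether you should integrate generative AI into your own creative process, it is also important to critically discern and understand the implications of doing so. Amid the debates around us about the use and misuse of AI, one thing is certain: we all need to better understand how it works before deciding whether we want to use it. Many of the warnings and cautionary tales about interacting with AI tend to lump AI into a single category, as if narrow AI and AGI were synonymous or all narrow AI machine learning models were the same. Deepening your understanding of all the creative potential that generative AI offers, how it can support your creative activity, and how it can be used to harm, misrepresent, normalize, exclude, control, mislead, track, steal from, and oppress humans is therefore an essential part of the process. This is true for any creative who uses any technology to express themselves and share their work with the world (Figure 3).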

A representation of robotic hands grasping a globe of the Earth.

Figure 3

Frankenstein’s AI lab as the earth itself. Iteration #56 of the author’s hands holding a globe of the earth

As you sway back and forth between integrating generative AI into your creative workflows and not using it at all, the same rule that applies to all technology applies to any generative AI system: you need to understand how it might benefit your own creative journey before making the judgment call on whether to use it. You also need to understand some of the known consequences of its use and how interacting with a specific narrow AI will impact other humans.

Do you need to use generative AI to prototype? Not at all. In fact, if you are already a content creator, you may already be satisfied with your own well-defined iterative and creative process. Can generative AI open you to more possibilities as a creative tool that also complements your own practice? Yes. Do you need to adopt a critical view toward the use of generative AI as you embrace the technology? Absolutely.

Taking Advantage of Generative AI As a Tool

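To illustrate the choices we have in engaging with generative AI, I recount the following story, which partly inspired this book. “Look at what I generated,” I overheard at a coffee shop while prompting the barista for a latte. “That’s amazing. You should post this.” “I already did.” While the exchange was brief and may have sounded like a completely different language to some, I gained a few insights. The person was using a text-image generative AI called Midjourney, and the only way to access Midjourney was through Discord, a communication and instant messaging social platform. The scene is worth recounting because it reminded me that when we interact with any generative AI, we face several choices as creatives: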

  • We do nothing with the content we prompt an AI to generate, except maybe store it in our photo library for later use if it is created as an isolated and solitary act of creation out of curiosity or just to pass the time.

  • We take what is generated and immediately share it publicly through whatever medium or social portal we fancy to show that we are hip and in touch with the latest trends. In the case of doing so on Discord, our motivation might be to immediately generate it on a social thread to receive likes, comments, or affirmation of our prowess to use the written word to generate something that often looks incredible.

  • We critically analyze the generated content and choose to try again to see what other marvels the AI can generate, pay money to have more credits and access more features, or keep sourcing new and free generative AI. We can then be inspired to create our own prototypes from content that is generated or delete it and start over. Or we can choose not to use generative AI ever again since we don’t need it as part of our own process of creation.

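When we generate content for more than the fun of playing with the machine, creatives may think about how whatever is generated can be used for a purpose beyond the immediate result. For example, a generated image could accompany a blog post you might write, an assignment you might be working on, a critical deconstruction highlighting the biases inherent in LLM-generated content, and so on. The process involves thinking about the content and how it can be used alongside whatever media we are creating (Figure 4). For creatives, the last point in the preceding list is to prioritize understanding how generated content can be recontextualized, integrated, modified, regenerated, or used as part of a larger idea or vision that you are creating. This means actively integrating generative AI content to support your own creative process, rather than treating what it first generates as a final product that needs to be shared immediately for likes, profit, or a show of prowess.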

An illustration of a robot gazing at a small hand mirror.

Figure 4

An AI bot prompted to “look at yourself reflectively” and accompanied by a photo of the author looking in a small hand mirror. Iterations = 45

The Wizard of Oz Prototype

Much to the disappointment of Terminator fans everywhere, a machine learning model is not capable of developing, nurturing, adapting, and acting creatively on its own from its pre-programmed intelligence and training. Rather than representing AI as a cyborg intent on destroying humanity, it is more accurate to liken generative AI systems to another movie character: the Wizard of Oz. The wizard behind the curtain pulling the strings is the origin of the term “Wizard of Oz” prototype. Generative AI can be considered a Wizard of Oz (WOz) prototype in that the wizards behind the scenes make it look like AI possesses creativity, personality, and intelligence, when in reality it runs simulations based on how it has been pre-trained on other people’s content. The idea of a WOz originates in the book The Wonderful Wizard of Oz by L. Frank Baum as well as its movie adaptation. The story features a wizard character who appears to be able to conjure whatever a person wishes. The truth, however, as revealed by protagonist Dorothy and the friends she makes on her way to the Emerald City, is that the wizard is a simple man, an inventor with no magical skills. AI’s Wizards of Oz, however, are a collective consisting of analysts who collect the source images and documents, labelers and annotators who classify that data, and machine learning engineers who write the code to interpret that data, train models on it, and test them. The magical intelligent creature called AI also seems to do it all by itself, but it is trained on a large amount of data, and behind the scenes, invisible, are the workers who exert a powerful influence on what we eventually experience as a creative and intelligent machine. AI cannot learn from specific interactions with individual users in the way that humans understand learning. A chatbot itself can explain this in more detail.

LLM: A chatbot can learn from interactions with individual users by storing and analyzing the inputs (messages) it receives from the users and the outputs (responses) it gives. By analyzing a large amount of these interactions, it can learn the best responses to give in different situations. This is done not by “understanding” the content of the interactions in the human sense, but rather by identifying patterns in the data. AI can only learn from specific user interactions if it is designed to do so and if those interactions are stored and used as part of the training data. Some AI systems, especially those used in sensitive areas like healthcare or finance, are specifically designed not to learn from individual interactions in order to protect users’ privacy. They are designed to forget specific interactions once they are over.

It’s not a personal, human type of learning, but rather a statistical analysis of patterns in data. By the time we interact with one, the experience is considered a simulation. It’s important to understand that the prompts we provide some generative AI may be stored along with the countless number of other human interactions. These may be used to further improve and update the underlying model, which in turn can lead to what everyone hopes is better and perhaps more “human” performance over time.
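
The distinction the chatbot draws, between systems designed to keep interactions for later training and systems designed to forget them, can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how any particular chatbot is built: `InteractionLog`, `retain_for_training`, and the other names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class InteractionLog:
    """Hypothetical store for one chatbot's conversations.

    retain_for_training=True  -> finished sessions are kept as future training data.
    retain_for_training=False -> finished sessions are discarded ("designed to forget").
    """

    retain_for_training: bool
    _current: List[Tuple[str, str]] = field(default_factory=list)
    _stored: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, prompt: str, response: str) -> None:
        """Remember one exchange for the duration of the session."""
        self._current.append((prompt, response))

    def end_session(self) -> None:
        """Keep or forget the session, depending on how the system was designed."""
        if self.retain_for_training:
            self._stored.extend(self._current)
        self._current.clear()

    def training_examples(self) -> List[Tuple[str, str]]:
        """Exchanges available to a later model update; nothing is learned live."""
        return list(self._stored)


learning_bot = InteractionLog(retain_for_training=True)
private_bot = InteractionLog(retain_for_training=False)

for bot in (learning_bot, private_bot):
    bot.record("Suggest a title for my painting.", "How about 'Blue Hour'?")
    bot.end_session()

print(len(learning_bot.training_examples()))
print(len(private_bot.training_examples()))
```

After one identical session each, the first log holds one stored exchange and the second holds none. In neither case does the model change during the conversation; any "learning" happens only if stored exchanges are later used in a training run.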

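The motivation for projecting intelligence onto machines is the ability of large language models (LLMs) to provide human-like responses. These stem from the advanced training that some LLMs may receive using reinforcement learning from human feedback (RLHF), in other words, advanced training with real humans. RLHF techniques aim to refine and optimize a language model’s responses by providing feedback on the quality of its outputs. Chatbots were initially designed to generate whatever response they deemed appropriate based on a human prompt. With RLHF techniques, however, chatbots learn to adapt to human preferences and expectations. RLHF is an impressive feat and, as you will read, it raises many dilemmas that users and developers need to talk through so that generative AI systems can evolve to support humans and so that the value of “do no harm” can be ever-present throughout the reinforcement learning process.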
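
The feedback loop at the heart of this idea can be illustrated with a toy example. The sketch below is a deliberate simplification and not RLHF itself, which in practice typically involves training a reward model on human preferences and then fine-tuning a language model with reinforcement learning. Here a few canned responses simply carry scores that human ratings push up or down. All names, the sample responses, and the learning rate are assumptions made for illustration.

```python
from typing import Dict


def apply_feedback(scores: Dict[str, float], response: str, rating: int,
                   learning_rate: float = 0.5) -> None:
    """Nudge a response's score up (rating=+1) or down (rating=-1)."""
    if rating not in (1, -1):
        raise ValueError("rating must be +1 or -1")
    scores[response] += learning_rate * rating


def preferred_response(scores: Dict[str, float]) -> str:
    """Return the response with the highest score (ties go to the first listed)."""
    return max(scores, key=scores.get)


# Before any feedback, every candidate response has the same score.
scores = {
    "Figure it out yourself.": 0.0,
    "Here are three ideas you could try.": 0.0,
    "I cannot help with that.": 0.0,
}

# Hypothetical human ratings collected from reviewers.
feedback = [
    ("Figure it out yourself.", -1),
    ("Here are three ideas you could try.", +1),
    ("Here are three ideas you could try.", +1),
    ("I cannot help with that.", -1),
]

for response, rating in feedback:
    apply_feedback(scores, response, rating)

print(preferred_response(scores))
```

The point of the toy is only that human judgments, not the machine's own understanding, decide which behavior is reinforced.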

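When it comes to the continued development of generative AI and its integration into all facets of human life with the value of doing no harm, the Wizards of Oz involved in that development need built-in safeguards as they navigate challenging conditions, smoke and mirrors, and reduced visibility as to what is in the best interest of other humans on the planet. If we simply follow the pattern, then we need to recognize the peculiar habit that trending technologies have of developing rapidly in the hands of an interconnected network of humans with differing intentions and value systems. In addition to machine learning engineers, our Wizards of Oz include investors, designers, managers and subordinates, leadership, corporate researchers, user interface designers, and API developers, who together test and release builds that demonstrate the iterative refinement of machine learning models, data sets, algorithms, and the generated content that we eventually interact with. If you have ever wondered, “Can the team behind a specific generative AI inform what content is generated?”, then let an LLM answer that for you (Figure 5).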

An illustration of an update notification for users of ChatGPT-3.

Figure 5

An update notification to users of ChatGPT-3 revealing that the application itself is a prototype that is constantly and iteratively being worked on

Besides generating prototypes for us, generative AI are themselves iterative prototypes. They are prototypes because the development of AI models involves constantly improving and refining the underlying model, the algorithms, and the classification and labeling of data. That process can lead to some justified skepticism about the reliability and stability of the content that AI-powered prototypes generate. In terms of what is meant by improving the underlying model, imagine that all the code that goes into a machine learning model is similar to a blueprint for a house that determines how the different parts of the AI system fit together. When you hear statements like “improving an AI’s algorithms,” what is improved are the step-by-step instructions that the AI uses to cook up its predictions or decisions. Improving and refining the algorithms can make generative AI learn faster, make more accurate predictions based on the data it is trained on, or use fewer computational resources.

In terms of making accurate predictions, think of those incremental improvements to generative AI prototypes as part of a process to increase their reliability. The algorithms may eventually be optimized to “improve factuality” by the programming wizards especially if they are motivated by persistent community engagement and emerging policies. That engagement will eventually lead to rules and constraints around how data is used, represented, and classified to ensure generated content offers multiple perspectives vs. normative ones. With voices from communities who have developed a finely tuned critical voice and can influence policy that creates boundaries around the use of automated AI systems that trespass boundaries of human privacy, freedom, and rights, we can better examine evolving AI-powered monster toddlers as prototypes, as incomplete versions that require human interventions to better serve human needs. These prototypes generate prototypes that might be useful fuel for more developed human-generated creations. Our role as creatives is to guide the machine like we would a paintbrush or a piano, curate the prototype, allow the machine’s so-called hallucinations to agitate our creativity and provoke and inspire us, and respond to the improvised offerings of our sci-fi AI. In the process of engaging in iterative conversations with generative AI, we can shift, change, and refine our own creative process.

The Value Proposition of Generative AI

The human-generated prototype of a book on generative AI recontextualizes some content generated from a variety of machine learning models to show how effective generative AI is to develop prototypes that can be refined and then integrated into a larger vision. This is possible if there is discernment at every (re)generated turn. The more you understand generative AI, the more likely it can be leveraged for a creative relationship that can be of great value. Embracing AI is going down the rabbit hole of creativity, and by experimenting you can judge for yourself if generative AI supports your own creative process. All kinds of humans from different disciplines can harness the persistently awkward and beautifully imperfect content that generative AI offer to experiment with their own customizable AI lab. Along with extending your creative toolbox, the book draws attention to the continued development of your critical voice as an essential component of interacting with generative AI.

Book Structure

Chapter 1, “Generating Creativity from Negativity”: AI is here, and the human responses to this technology range from celebratory to alarmist. This chapter highlights that the reactions to the many fears and dilemmas that surface with generative AI can also inspire creatives to adapt, refine, critique, and recontextualize their creative work. Creatives will benefit from differentiating between AGI and narrow AI so they can make an informed decision about whether and how they might use generative AI.

Chapter 2, “Being Creative with Machines”: This chapter shows the unique affordances that generative AI offers and provides takeaways from a historical overview of the construction of intelligent machines and how these have inspired acts of creation.

Chapter 3, “Generative AI with Personalities”: This chapter encourages creatives to create personas when they interact with AI as a useful prototyping tool. The chapter also details some of the known AI personas based on positive and negative representations of the technology.

Chapter 4, “Creative Companion”: This chapter defines how AI can support the creative process when used with specific intentions and details how and what we can learn from AI when reimagined as a creative muse. This is demonstrated through memorable conversations with an AI muse in the form of captured prompts with various natural language models and with generated images from text-image AI.

Chapter 5, “Prototyping with Generative AI”: This chapter uses AI-generated content to describe different types of prototypes and how we as humans engage in prototyping all the time. Creatives will also benefit from understanding how to integrate generative AI within their workflows.

Chapter 6, “Building Blocks”: This chapter demonstrates the iterative nature of creativity that’s possible with AI. The variety of machine learning models available is yet another tool in the sandbox to boost creativity. Using specific AI-generated content, this chapter also introduces building blocks that can be used to enhance the prototyping power of machine learning models. These include variation, substitution, addition, subtraction, and transposition.

Chapter 7, “Generative AI Form and Composition”: This chapter introduces how to structure, contain, and curate your creative outputs so you can best leverage generative AI as useful prototyping companions. The chapter will also show how AI prototypes create new forms and structures, add to existing genres, and reform and transform past forms to influence future ones.

Chapter 8, “The Art of the Prompt”: This chapter focuses on the art of text-based prompting: a list of terminology, the ins and outs of prompts, and recommendations readers will find useful. The chapter also presents a use case that describes the iterative creation of prompts to generate prototypes across three different text-image generative AI.

Chapter 9, “The Master of Mashup”: This chapter demonstrates how to leverage generative AI to prototype new ideas influenced by specific genres of art and writing. Through poetic necessity, the chapter will also merge genres like Impressionism and bad sitcoms into a mashup of ideas and identify the value of humor and parody.

Chapter 10, “Uncanny by Nature”: This chapter celebrates the unexpected joys and awkwardness that generative AI offers us. The chapter will explore generative AI’s inherent proclivity toward generating the uncanny valley in text-image beasties. It will also rejoice in the unexpected results that generative AI create and how human creators can take advantage of these new forms and innovate.

Chapter 11, “Dilemmas Interacting with Generative AI”: This chapter deals with many of the ethical dilemmas that generative AI brings to the surface and encourages a heightened awareness toward them when interacting with any machine learning model.

Chapter 12, “Use Cases”: This chapter provides a wide range of use cases of how generative AI is being used in creative workflows across disciplines.

Chapter 13, “AI and the Future of Creative Work”: This chapter explores the degree to which AI will be integrated into future jobs and highlights the dependency that each of those jobs will still have on a human’s creativity. The chapter also proposes that creatives identify routine tasks in their workflows to better understand how generative AI might augment their creative process.

Acknowledgments

  • The Wonderful Wizard of Oz by L. Frank Baum that has brought magic to many for decades

  • The Terminator movies by James Cameron, which have left a long-lasting impression of intelligent cyborgs bent on human destruction

  • The wizards of AI who form different collectives of individuals whose obsession with simulating human intelligence has had far-reaching implications on the future of creatives

  • To those who embrace AI in addition to those who provoke us to critically examine its implications on a larger social, political, and ethical dimension

Takeaways

  • Deepen your understanding of generative AI and how it works to support your decision as to whether or not you use it as part of your creative process.

  • Learning is different for humans than it is for machines.

  • Generative AI systems are prototypes that require humans to improve them, including humans who are critical of the current state of AI prototypes.

  • Generative AI offers you prototypes in the form of content that needs to be fact-checked, refined, repurposed, edited, and researched.

Figure 6 below is a QR Code that links to the author’s website (http://ai.patrickpennefather.com). The website hosts new articles on AI, additional use cases by guest creators who are integrating generative AI in their workflows, and a number of experimental generative videos.

A QR code.

Figure 6

Scan the QR code to go to the author’s website

Acknowledgments

This book has come together because of a confluence of forces: the existence of musician, composer, and innovator Sun Ra and a large language model (LLM) called ChatGPT. Sun Ra’s influence on free jazz in the 1960s, with its infusions of African and Latin American elements, was epic. Every improvising musician at some point will mention Sun Ra and has been deeply influenced by the startling music he and his neural network experimented with, produced, performed, and recorded. Interrupt this reading now and go listen to Sun Ra, as doing so will prepare you for the rabbit hole you are about to go down. Without Sun Ra, I would not have studied improvisation at the piano. Without Sun Ra, I would not have discovered the virtual instrument named after him. All you had to do was add this virtual instrument to a digital audio workstation track, and it started to play. There were some controls on the virtual synth that you could tweak, and the plug-in would slowly adapt, but it really played best on its own. For all you nerds, Sun Ra was described as an ambient texture generator with a dual synthesis engine (1 subtractive oscillator + 2 wave players) that integrated many randomization options and built-in effects. Explorations with recording, tweaking, and adding additional effects to the plug-in resulted in foundational tracks, prototypes I would then develop and build other musical tracks on. This was my first use of an intelligent virtual synth as a companion to my creative process. That creative relationship with Sun Ra would last for another 20 years.

Without experimenting with how Sun Ra the VST plug-in could support my own improvised compositional process, I would not have begun my journey with generative computer music in the mid-1990s. Without continued experimentation over many years, I would not have written about it. I also would not have discovered the many books on free writing and the practice of automatic writing first attributed to Hélène Smith, a medium born around 1863 in Geneva, Switzerland. That was the same year Samuel Butler’s essay “Darwin Among the Machines” was published. I’m not saying I channeled the writing like a medium might, far from it, but given my background and continued practice of free improvisation at the piano, this book evolved from an intentional back-and-forth between my artistic practices, my own gestures at the keyboard (meant to be semantically interpreted), and my improvised prompts with generative AI large language models as instruments. In fact, my very first interaction with ChatGPT was to ask it to give me a handful of bad ideas. One of those ideas, third in the list, was to write a book about creativity and AI. So I have.

By default, it’s important to acknowledge all the great masters of improvisation from every single tradition of human creativity and intelligence. Why? Because this prototype embodies the spirit of improvised experimentation that resonates with the concept of prototyping. Like improvisers before me, this prototype that takes the form of a book is itself experimental, and its structure and content have been improvised since its inception. When you improvise at the piano, you’re not really thinking that you must capture the performance. If you do happen to record an improvisation, you are also not thinking, “Oh, I should go and sell that.” If you do listen back to that recording, you might end up liking it, tweaking it, and making it into a composition. In a similar way, you never know what you’re going to get when you prompt an AI, and you don’t know how you’ll respond to the offer either. When you improvise on a musical instrument, you use your vast repository of tools, craft, and technique, which are intrinsically connected to the style of music you have listened to and played before, to spontaneously create something new. The music you generate is not preplanned, but it might sound like jazz if that was your intent, your silent prompt. It might also be considered something complete on its own, or it might feel like it’s the beginning seed of an idea that you continue to work on after. While my own improvisational practice has manifested predominantly at the piano, the instrument of choice to generate the content in this book has transpired through a different type of keyboard, one that captures ideas with the written word iteratively.

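Generative AI follows a similar pattern of creation. It looks at its data set and, based on algorithms, learns how to label the data and generate an offer that you can respond to, something unique in some prototypical form. Figure 1 is one of the offers I responded to iteratively as I imagined myself transformed into some kind of robot, collaborating with an AI to write the book in front of you. This book also acknowledges all the cyborgs who have worked for decades to develop intelligent AI systems that support human creativity.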

A photograph of a robot working on a laptop while smoking a cigar against a library backdrop.

Figure 1

A text-to-image AI attempts to visualize the author writing with an AI companion in a library, based on a photo of the author hard at work writing a book without a cigarette. Total iterations = 170

I also acknowledge the leadership, students, faculty, staff, and industry partners that I have interacted with at the Centre for Digital Media in Vancouver, Canada, since 2007. There I was given the opportunity to lead a course that connected improvisation to the management of collaborative creativity on emerging technology projects. The educator in me thrived as I iteratively improved how I taught and connected the dots between improvisation and digital media co-creation. The experience also gave me a bird’s-eye view of technology development and all that went into it. Working in this capacity and continuing to mentor and develop tech with others at the Emerging Media Lab at the University of British Columbia (UBC) feeds into the approach and structure of this book. Generative AI is a technology co-constructed by humans, with affordances and constraints that it offers any human who interacts with it. Understanding how it works, its underlying engine, is also a part of the story that is important to tell, as this will fuel the critical muscle that all creatives engage in developing no matter the artistry or craft. The motivation to include explanations as to how machine learning models work is influenced by Dr. Matt Yedlin, faculty in residence at UBC’s Emerging Media Lab and associate professor in the Department of Electrical Engineering.

Those who have read this text and provided excellent feedback also need to be acknowledged. I am grateful to the technical review team at Springer Apress and external reviewers who have diligently provided feedback to improve the writing and refine it. Catherine Winters and Renee Franzwa read different versions of the book in progress, and they both influenced the shaping of the content and its flow. Colleague Dr. Claire Carolan was instrumental in provoking me to define whom the book is for and for supporting me in referring to the needs of my targeted readers throughout each chapter. Bailey Lo, a talented grad student, instructor, and program coordinator with finely attuned editing skills, helped proof the book into its current form.

Since I refer to AI as a muse throughout the book, I also recognize my own muse, an embodied person known as Dr. Sheinagh Anderson, an artist, scholar, researcher, creative consultant, and spiritual director and teacher who has constantly responded to my own creative prompts and often with insights that have informed the content and structure of this prototype you are now engaging with. Dr. Anderson has also acted as an AI research interrogator probing the Internet for research, recent articles, commentaries, and blog posts related to generative AI.

Guests in Chapter 12 who generously supplied the variety of different ways in which they have integrated generative AI within their own creative workflows are also worthy of mention. These include Dr. Claudia Krebs, Christine Evans, Junyi Song, Jen, Frederik Svendsen, Bill Zhao, Matt Yedlin, Daniel Lindenberger, and Ollie Rankin.

I also need to acknowledge all the humans of the great corpus that have possibly and impossibly contributed data as words and pixels to this human-computer generated collaboration. Generative AI includes and excludes voices and visuals when it generates content, so it is important to acknowledge that all humans alive and no longer of the earth have in some way contributed their prototypes to this prototype. That includes the great Shakespeare and his still relevant works, in addition to a history of musical artists known to have disregarded traditional musical conventions in favor of the free-form exploration mentioned earlier. I draw inspiration from South and North Indian music with the likes of Ravi Shankar, Ali Akbar Khan, and L. Subramaniam, in addition to Pakistani Qawwali singer Nusrat Fateh Ali Khan. Free jazz and contemporary music improvisers like Charlie Parker, John Coltrane, Miles Davis, Ella Fitzgerald, Sun Ra, Chick Corea, Cecil Taylor, Pauline Oliveros, and Ornette Coleman inspired me to break free from many established conventions in jazz music to make new offerings to the world. My own master teacher Casey Sokol from York University and the improvising musicians from the Canadian Creative Music Collective (CCMC), which formed in Toronto in the mid-1970s, played a pivotal role in the development of my own voice.

Many a generative AI relies on ideas that advocate “free writing” as a regular practice, such as those offered by authors like Natalie Goldberg, who wrote Writing Down the Bones: Freeing the Writer Within, Julia Cameron, Peter Elbow, Virginia Woolf, and Jack Kerouac. Text-image AI also relies on those known to have improvised in their visual art creation, like Pollock with his drip painting, Joan Mitchell with her abstract expressionism and printmaking, Gerhard Richter, and Wassily Kandinsky, who believed in balancing spontaneity with structure and theory. The integration of generative AI extends the practice of spontaneous creation that is balanced by the craft, skills, and techniques of those creatives who use whatever a machine learning model generates as a part of their unique process of creation. Finally, the meeting place of my human self with the technological experiment that is a book written on the nature of cyber-creativity owes much to the inspiration of authors like William Gibson, Neal Stephenson, and Donna Haraway.

Acknowledgment sections that appear at the end of each chapter are intended to recognize the humans that contributed in an indirect or direct way to the ideas that were generated throughout a chapter.

Table of Contents
Index
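About the Author
Dr. Patrick Parra Pennefather

A photo of Patrick Parra Pennefather.

is an assistant professor at the University of British Columbia in the Faculty of Arts and the Emerging Media Lab. His teaching and research focus on collaborative learning practices, digital media, xR, and Agile software development. Generative AI is integrated into every course he teaches and into the research he conducts to support the development of emerging technologies. Patrick also works with learning organizations and technology companies around the world to design courses that meet the needs of diverse communities and support the development of the next generation of technology designers and developers. His teaching centers on creativity, collaboration, sound design, xR development, and Agile, with an emphasis on mentoring critical twenty-first-century competencies. He is currently leading several research creations in collaboration with UBC Library and the Emerging Media Lab (EML) that leverage AI and motion and volumetric capture studios to facilitate the creation of new xR works exploring Shakespearean characters and scenes on different virtual stages.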
 
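About the Technical Reviewers
Catherine Winters

A photo of Catherine Winters.

has been an enthusiast of generative art tools since the 1990s. By day, Catherine works as a software developer at the University of British Columbia, building virtual and augmented reality software for teaching and research. An avid fan of narrative and environmental storytelling games, Catherine devotes her spare time to designing character-driven narrative games and atmospheric “walking simulators” such as 《下班后》, her game about being the last person in the office.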
 
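Renee Franzwa

A photo of Renee Franzwa.

Known to friends and family as the “Wandering Ginger,” she is an accomplished educator, technology enthusiast, and entrepreneur who has lived and worked around the world, most notably in Ghana and the Galápagos Islands. Raised between San Francisco and East Texas, she developed a love for differing schools of thought at an early age, and throughout her career she has thrived in the creative space between seemingly contradictory disciplines, such as majoring in statistics with a minor in theater, building digital products derived from experiential curricula, and using stand-up comedy as a tool to foster inclusivity. She has built products, programs, and teams for UCLA, Stanford University, General Assembly, the Bill & Melinda Gates Foundation + EdSurge, and, most recently, Unity Technologies. Renee is currently researching her first book, which focuses on alternative therapies to enhance the mental health of aging populations.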
 
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
P. Parra Pennefather, Creative Prototyping with Generative AI, Design Thinking, https://doi.org/10.1007/978-1-4842-9579-3_1

1. Generating Creativity from Negativity

Patrick Parra Pennefather (1)
(1) University of British Columbia, Vancouver, BC, Canada
 

This chapter intentionally separates generative AI and its narrow AI siblings from Artificial General Intelligence, which has been given many names describing a moment in time when a human-made machine develops consciousness. The chapter proposes that creatives differentiate between the two so that they can better reconcile the pros and cons of interacting with generative AI. The cautionary tales and negativity directed toward any AI can also be turned around to activate our creativity. We see this in the work of many artists and highly creative individuals who innately resist conformity, transform negativity into positive forces for good, and desire to improve planetary existence.

Differentiating Between Narrow and General AI

The sky is falling.

This famous quote from The Remarkable Story of Chicken Little resonates at a time when reactionary responses to narrow AI tools, their machine learning models, neural networks, and their deep learning foundations are cautionary and dystopian. For example, the central premise of the singularity is that it would mark a point in time when an autonomous, self-recursive, self-improving AI would surpass human capabilities and cognitive intelligence, leading to a period of rapid technological growth that is beyond human capacity to comprehend or control. Interested creatives are encouraged to deepen their research into the debates that swirl around definitions of intelligence, cognitive capabilities of machines beyond problem-solving, what constitutes an autonomous AI agent, and more.

Author: Say this in a humorous way: These superintelligent machines would be capable of recursive self-improvement, continuously getting better and faster at designing and building successive generations of increasingly powerful machines, creating a runaway effect.

LLM: So imagine these hyper-smart machines, right? They’ve got the DIY bug, and they’re on an unstoppable roll! Each time they upgrade themselves, they get quicker and more powerful. Today, it’s an extra RAM stick; tomorrow, it’s quantum processors. Before you know it, they’re so ahead of the game that they’ve left us humans eating their fan filter dust! The singularity, folks, it’s like a never-ending tech upgrade on steroids!

While the mistrust humans have for any AI seems exaggerated compared to their mistrust of some other technologies, the resistance, fear, and misunderstanding of its uses are a familiar story when it comes to the adoption and rejection of new technologies. The printing press, the railway, electricity, textiles, refrigeration, the automobile, agricultural machinery, mechanization, television, the calculator, automation, the Internet, and Wikipedia have all faced a spectrum of opposition even as they have been adopted and have benefited human cultures. So have tomato harvesting machines.

LLM: The introduction of tomato harvesting machines in the mid-twentieth century significantly increased harvesting efficiency and transformed the agricultural economy, especially benefiting regions like California. This innovation also led to the development of tomato varieties better suited for mechanical harvesting, albeit often criticized for diminished flavor. While these machines led to labor market changes and some displacement, they facilitated the growth of the processed tomato industry and broadened the availability of tomato-based products. These advancements also had environmental implications, including potential increases in fertilizer and pesticide use and greater water demand.

To calm everyone down: at this precise moment in history and for the foreseeable future, while tomato harvesting machines have become accepted technologies in the agricultural landscape, Artificial General Intelligence does not exist. That said, the variety of warnings that suggest a not-too-distant future when AI will mysteriously be programmed by godlike humans to be autonomous, make decisions without pesky humans involved, demonstrate scientific definitions of consciousness, be self-determined, and thus decide if the human race is at all necessary needs to be separated from the more thoughtful warnings about AI systems being weaponized by humans and used with other bad intent. A quick tête-à-tête with an LLM reveals just how long the fear of intelligent machines surpassing the intelligence of their human creators has been around.

LLM: A notable warning historically dates to Samuel Butler’s 1863 essay “Darwin Among the Machines” and later his 1872 novel “Erewhon,” in which machines are speculated to eventually reproduce, evolve, and surpass their human creators. These works were largely satirical and speculative, but they do represent some of the earliest discussions of what we would now recognize as AI.

Further research reveals that “Darwin Among the Machines,” written by Samuel Butler in 1863, also builds on Shelley’s 1818 metaphorical warning about human creations surpassing their creators. Butler’s essay and novel both reveal that this fear has been in our own consciousness for a long time. “We are ourselves creating our own successors,” Butler claimed, “giving them greater power and supplying by all sorts of ingenious contrivances that self-regulating, self-acting power which will be to them what intellect has been to the human race. In the course of ages, we shall find ourselves the inferior race.” Butler’s concerns about intelligent machines have been passed down for over a century. Like today’s modern computational fortune-tellers, Butler made the following prediction referring to an intelligent machine: “Complex now, but how much simpler and more intelligibly organised may it not become in another hundred thousand years? or in twenty thousand?” Web searches on the history of warnings about intelligent machines do not surface Butler’s written work in their top results and instead reference more contemporary sources. It goes to show that an LLM can also lead you to sources that are difficult to locate with some established search engines (Figure 1-1).

A screenshot of a web search with a list of links. They include an article from medium.com on how history warns of the deadly threat to humanity, suggested search queries under People also ask, and articles from Harvard University and Wikipedia.

Figure 1-1

A search for the history of warnings about intelligent machines did not immediately reveal its origins, while an LLM did

More Relevant Historical Contributions to Computer Science

There also exists a long and complex history of computer science that is worth mentioning. Sandwiched in between Shelley’s 1818 masterpiece and Samuel Butler’s warning is the work of Charles Babbage and Ada Lovelace. Ada was an English mathematician and writer known for her work on Charles Babbage’s early mechanical general-purpose computer, the Analytical Engine. Her notes on the engine include what is recognized as the first algorithm intended to be processed by a machine. Because of this, she is often regarded as the first computer programmer, even though her work predates the invention of what we now consider a modern computer by over a century.

Lovelace’s major contribution to the field of computing was her vision of the potential of the Analytical Engine, beyond mere calculation. In her notes, she imagined that a computing machine could create not just mathematical calculations but any form of content, such as art or music, if it were provided with the appropriate input and programming.

As my LLM is quick to point out:

Lovelace wrote in her notes, “The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform. It can follow analysis, but it has no power of anticipating any analytical relations or truths.”

This is a critical concept in the design and use of modern computers and, to some extent, in the field of AI. It emphasizes that a machine’s abilities are determined entirely by the instructions given to it, implying that AI and computers are tools that can perform tasks but do not “think” or “create” in the human sense. Ada Lovelace did not directly say anything about AI as we know it today, nor did she make any warnings of intelligent machines surpassing human intelligence. Her ideas, however, were foundational to the emergence of computer science as a discipline.

Generative AI is often associated with AGI, and it’s important to keep in mind that they are different. Generative AI are also different from other narrow AI applications in the tasks they are programmed to perform. No, an LLM is not coded to track every word you input for evil intent. Generative AI generate content (Figure 1-2). That is their sole task, and even then, they don’t always do it well. Most generative AI don’t store the images that are generated. Imagine how uncontrollably large the data set would be. All generative AI are trained on specific data that for the most part does not grow except through dedicated humans involved in reinforcement learning. But that’s time-consuming and expensive and may lead to an increase in bias.

A photograph of the blueprint of the aerial view of a city generated by AI. The building structures are arranged in structured layouts.

Figure 1-2

A blueprint of several generated images by a text-image AI shows intricacy and detail but no desire to rule the world of humans

While generative AI can propose content that is biased and untrue and exclude voices that don’t follow the norm, other narrow AI applications that analyze large amounts of data and make predictions can lead to issues with privacy, surveillance, and discrimination. These AI are often controlled by public or private entities that lack transparency and accountability, particularly when critical decisions affecting the lives of humans are made by machines. It’s also easy to form a quick opinion that the AI are taking over when many artificially intelligent programmed systems are so automated as to have become somewhat invisible, taken for granted as they continue to disrupt the way we are used to doing things. Now that generative AI has infiltrated companies and educational institutions, we are at least no longer asleep to the power and potential that comes with the technology: good, bad, or somewhere in between. An important factor in whether generative AI is good or bad is the human who influences what it will generate and their motivation for doing so.

Tech Is Bad… AI Is Bad

Every technology is prone to be used in inappropriate, damaging, and often violent ways. At times technologies are used to reinforce the status quo, exclude voices that fall outside what a society considers “normal,” or perpetuate false information or untruths. Those who develop the technologies can no longer rely on arguments that absolve them of any responsibility by blaming the toddler that they co-created. Technology, any technology, is not a neutral force. It can be used for good and for bad and on a spectrum between. This is evident when we examine extreme cases of generative AI that have already surfaced in the abuse of machine learning models to generate deep fakes, in particular those that use the faces of celebrities on other bodies without their permission to generate money-making content, fake news, and pornography.

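Developers of many AI systems have made a habit of associating intelligence with creativity, leading to the mistaken belief that generative AI can automate all the creative processes once reserved for creatives. When you read that an AI can seemingly generate in moments the acts of creation that creators once spent countless hours on, and that creativity itself can be automated, you need to question the nature of creativity itself, in addition to the specific tasks an AI can take on without any human intervention. Development teams that support creative industries are not motivated to replace jobs; they are tasked with solving the problems that persistently create roadblocks for creative teams. Narrow AI should not replace jobs; it should support creatives by automating tasks under human supervision. Currently, narrow AI is supporting creative teams in accelerating some tasks. These include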

  • Supporting some pre-visualization tasks like generating concept art quickly to get feedback on the look and feel of a character or environment. This feature allows creatives to more rapidly locate and populate a mood board for a project they are working on or want to pitch to others.

  • Replacing background in images, giving creatives the opportunity to see what a subject they captured as a photo might look like in multiple types of backgrounds quickly. This feature, now available in popular software like Photoshop, automates what amounts to many hours of work in some cases.

  • Rotoscoping, a technique animators use to trace over motion picture footage frame by frame (that’s 24 frames for every second of footage). Automating it relieves previously tedious work and accelerates the process.

  • Increasing resolution from 2K to 4K, letting animators create work in lower resolutions and then allowing the AI to upscale them to higher resolutions. This process saves time and money for rendering.

Generative AI provokes us to question its role within our own creative process and to distinguish the tasks we undertake that are creative from those that are less so. The technology of generative AI is best applied in supporting those tasks that creatives deem less creative and can be a useful time saver, letting a creative focus on making changes to their works in progress based on client, team, or user feedback. The illusion that generative AI is creative, and that creativity itself can be correlated to intelligence, is a generalization supported neither by evidence-backed research nor by the pragmatic integration of AI across creative industries. What may mute debates about an AI taking over the job of an artist, for example, is that whereas generative AI are programmed to seek patterns and generate content from those patterns, many artists look for patterns in order to break them, to move beyond them and create something that has not been created before. The groundbreaking part of AI as a technology lies not in the code itself but in the imagination of the creative who guides it to support unique acts of creation.

Tech Is Good… AI Is Good

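For some creatives, generative AI is a more useful tool if it is used fairly to transform, recontextualize, and fuse their own ideas, words, and images in order to support and provoke their creative process through prototyping. For others, its very existence is criticized as reckless, dangerous, and likely to perpetuate some irreconcilable dilemmas. Still others embrace the use of generative AI as a site of protest, critiquing the very thing they use to communicate their art. Websites dedicated to showing the serious and silly affordances of LLMs keep popping up. Generative AI can be used to intentionally show off its inability to state facts or to write poetry that doesn’t rhyme, its habit of creating too many hands, and its inability to generate musical compositions that disrupt the patterned, formulaic music that is persistently created and promoted as the best in the world. If applied fairly, intentionally, and appropriately, generative AI can provoke your creativity when you prototype ideas that don’t follow the norm, in support of your own creative process. As in John Greene Chandler’s Chicken Little, the sky is not really falling, although the voices of dissent deserve attention (Figure 1-3). In this age of computational power, we can appreciate the benefits that generative AI can offer us, and many will use, and already have used, them for a variety of reasons to benefit other humans.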

A photograph of AI-generated chickens. The left chicken has one eye, the middle chicken has two crowns on its head, and the right chicken is short and has a long beak.

Figure 1-3

The prompt “chickens reacting to the sky falling” accompanied an original photo from archive footage of chickens. A style filter that makes eyes giant was applied. Total iterations = 20

So let’s put that into practice, honoring Chandler by reinventing his well-known story for a time in history when AI is captivating the attention of an increasing number of humans.

Author: Give me a version of Chicken Little’s story with “The sky is falling. The sky is falling” but use AI.

LLM: Once upon a time, in a small village, there lived a chicken named Little. One day, while pecking at the ground for food, Little heard a strange noise. Suddenly, an AI-powered weather drone flew overhead and projected a holographic image of a dark, stormy sky. The drone announced, “The sky is falling! The sky is falling!” (Figure 1-4)

A photograph of an AI-generated barn at night. The barn is surrounded by hens. Two drones are hovering above the barn. There is a forest in the background.

Figure 1-4

The prompt “animated chickens on a modern farm, stormy sky with some sun behind clouds projected by a robot drone, trees, 4K” accompanied a photo of a farm with chickens taken by my sister 30 years ago. Iterations = 61

Little was terrified and ran to tell the other animals in the village. They all scoffed at her, thinking it was just another one of her wild ideas. But Little was determined to warn them of the danger and set out to find a solution.

She came across a group of scientists who were working on an AI-powered weather forecasting system (Figure 1-5). They explained to her that the AI had detected an approaching storm and that the village needed to prepare for it. Little was relieved to hear this and ran back to the village to tell the others the good news.

A photograph of an AI-generated cartoon depicting five foxes working on a laptop. One of the foxes is adjusting the antenna on the top of the laptop. The foxes are in double shades.

Figure 1-5

The prompt “a group of 2D animated foxes working on an AI-powered weather forecasting system” was based on an archived 1908 illustration cover found on archive.org, with the background replaced behind the foxes and on the laptop screen and color correction applied to make the fox colors more orange. Total iterations = 46

The animals quickly began to prepare for the storm, building shelters and gathering supplies (Figure 1-6). The storm hit, but thanks to the AI’s warning and Little’s determination, the village weathered it safely.

A photograph of AI-generated chickens. The chicken on the left is short and has large eyes, the chicken in the middle has large cheeks, and the chicken on the right is grey and short. A bolt of lightning strikes in the distance.

Figure 1-6

The prompt “chickens concerned with an approaching storm, lightning, 3D” was used with the same original photo from archive footage of chickens as Figure 1-3. Part of a series with total iterations = 20

From that day on, Little was known as the village’s hero for her quick thinking and her willingness to believe in the power of AI. And the village learned to trust in the AI-powered weather forecasting system, and it helped them keep safe from future storms.

The end.

An interesting twist offered up by an LLM needs crafting as far as storytelling goes, but the seeds are there. We have an AI-powered drone seemingly acting on its own and creating panic only to find out later that it is being operated by scientists to warn Chicken Little and her friends of the danger ahead. The AI shows the intrinsic interdependency of human (or in this case fox) and AI. For those familiar with the story of Chicken Little, the generated version really does not even come close to the original in depth, allegory, or rich characterization. It might, however, become the seed for a new story. The generated story also reveals the unexpected bragging that the LLM had to include as part of the story. That grandstanding is also an indicator of the Wizard of Oz behind the curtain and their need to boast about their Frankentoddler.

Generated stories of all types can be curated by any creative and can be used more ethically when thought of not as an end product but as part of your creative and critical process. All manner of prototypes become possible that can inspire creators of all kinds. The activity of generating a story about AI using plot points and characters from the Chicken Little story showed us a prototype that needs refinement. To state the obvious, generative AI needs a human in the equation in order to generate anything, and again to refine the content it generates. Every narrow AI requires the human in the equation. They need to be guided to have intention and purpose. They are unable to do that on their own.

Reconciling the Hype and the Vilification of AI

As a creative on the constant lookout for new tools, it’s important to reconcile your use of generative AI before using it. AI is not the first technology to gather our critical attention. Our reactions to the positive and negative hype around any technology follow a historical pattern that was well articulated by author Langdon Winner in his book The Whale and the Reactor, published in 1988. For Winner and many thinkers before and after his book was published, it is crucial to be awake and critically active regarding the social, economic, and political dimensions of any technology that is in the process of being adopted. Human interactions with programmable machines have persistently triggered media outlets to weigh in on their pros and cons and to compete for your attention by dramatizing artificial intelligence (AI) with a mix of hyperbole, unreasonable fear, and negativity. You’ll read, listen to, and observe positive and negative headlines, including those about web-based LLMs that can generate anything from an essay on the influence of Alan Turing that omits Joan Clarke to a ridiculously inaccurate depiction of an author writing a book on generative AI in a library. Opinions are further amplified by popular authors, thinkers, politicians, scientists, artists, and activists. The attention-grabbing headlines that follow were generated using an LLM trained with supervised and reinforcement learning methods. As part of your own creative process, task an LLM to generate the pros and cons of AI. You will see that both negative and positive headlines require critical attention and research before being taken as statements of truth. The same can be said of human-generated headlines. The decision to use generative AI requires your own discernment.

Positive Headlines

  • “AI leads to significant productivity gains for businesses”

  • “AI-powered healthcare systems improve patient outcomes”

  • “AI helps tackle climate change through more efficient energy use”

  • “AI-powered education revolutionizes the learning experience”

  • “AI creates new job opportunities in technology and data fields”

Negative Headlines

  • “AI systems perpetuate and amplify bias and discrimination”

  • “AI leads to job loss and unemployment in traditional industries”

  • “AI raises ethical and privacy concerns in surveillance and decision-making”

  • “AI steals the work of humans without them even knowing”

  • “AI exacerbates income inequality through automating high-paying jobs”

Human-Generated Headlines

  • AI is like the creation of the atom bomb (Warren Buffett)

  • AI machines aren’t “hallucinating.” But their makers are (Naomi Klein)

  • The “godfather of AI” says he’s worried about “the end of people” (referring to Geoffrey Hinton)

  • “We are a little bit scared”: OpenAI CEO warns of risks of artificial intelligence

Is artificial intelligence really like the creation of the atom bomb? That requires research, but the hyperbole likely has other intentions behind it. That said, if an AI is programmed to play out war games and a nihilistic programmer decides to let it control real nuclear bombs, then, yes, AI is as dangerous as an atom bomb. Does the rise of human-created artificial intelligence signal the “end of people,” or is that statement generated by a tragic hero of AI who feels compelled to apologize for their role in its development, even though warnings existed as early as 1863 (Butler, 1863)? The correlation between AI and any apocalyptic future for humanity depends on what deciding powers humans give any AI. That scenario does not mean an AI needs to be sentient. It means that greater control over what any AI can automate needs attention and regulation.

Whether or not any of these scenarios, each of which requires human intervention to end the human race, ever plays out, popular LLMs will continue to raise eyebrows because they are programmed to generate normative content and to emulate human-like communication in a prescriptive way. They seem capable of answering the questions or prompts that we throw at them using human-speak. They use conventional ways to communicate ideas that we are familiar with. Yet, despite ongoing efforts by programmers to make them appear more human, we also need the discernment to compare them to our own social and conversational relationships with the real humans in our lives. We can be as much in awe of the knowledge an LLM can regurgitate from its large data set as we are astonished by its exclusion of voices that are not part of the norm. When there is no one around to have a conversation with that is intentionally focused on a creative idea you are thinking about, generative AI may just be a useful companion or muse in that moment.

Creative Activities to Try While Skynet Thrives

The atom bomb, the end of people, and being frightened of AI are all metaphors used to talk about the dangers of AI. They can also be creative hothouses. You can use these stories, metaphors, and cautionary tales as jumping-off points for your own science fiction, song, design, manga, website, TikTok short, TV series, or blog post.

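You can also be inspired by artists and creatives of the past, who have often used difficult, negative, and turbulent periods of history as a catalyst for their work. The use of generative AI, which is sometimes difficult to reconcile, offers a rich source of emotional and psychological material that can be transformed into impactful art. Your first creative activity is to draw inspiration from your own ideas, artists, and experiences and to understand how acts of creation connect to protest, emotional turmoil, and the madness of being human. Some examples include the following:
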
  • Pablo Picasso’s Guernica was created in response to the bombing of Guernica during the Spanish Civil War. The painting is a stark portrayal of the chaos and violence of war, making it one of the most powerful anti-war statements I have ever experienced.

  • Mexican painter Frida Kahlo used her art to express her physical and emotional suffering following a severe traffic accident that left her bedridden and in pain for much of her life. Her work, often shocking in its raw portrayal of pain, became a symbol of strength and resilience.

  • After being sentenced to death and then having the sentence commuted at the last minute, writer Fyodor Dostoevsky went on to write important novels like Crime and Punishment, The Brothers Karamazov, and Notes from Underground.

  • American poet, writer, and civil rights activist Maya Angelou faced a traumatic childhood with racial discrimination and sexual abuse that she channeled poetically into her writing, including her autobiographical work, I Know Why the Caged Bird Sings.

  • Ever the site of protest, the anonymous street artist Banksy uses their work to comment on political and societal issues, such as war, capitalism, and poverty. Notable is Banksy’s Girl with Balloon, which self-shredded during a Sotheby’s auction in London in October 2018, just as the gavel came down to confirm the final bid of over £1 million. The work was later renamed Love Is in the Bin and was a direct critique of the art world and its commercialization practices.

There are many more artists who have transformed their adversity and the challenges of their time into powerful creative expressions, showing that art and creative acts can be a profound vehicle for processing and communicating difficult experiences and critiquing established social practices and behaviors. Creatives of all disciplines are already engaged in leveraging generative AI as a place of focus for social critique. Trevor Paglen, known for his work with AI and surveillance, used AI and machine learning to create a series of images titled “Machine Visions.” The project was designed to critique and raise questions about how AI systems are “trained” to see and interpret the world. Musical artists like Holly Herndon, Taryn Southern, Dadabots, Ash Koosha, and others highlight the potential of AI as a creative tool and also raise important questions about authorship, creativity, and the future of music in an increasingly automated world.

Five Why’s

You can also apply a design tool that creatives have used to get to the root of a problem. The Five Why’s exercise continuously asks why to a particular problem that a team is trying to identify. Asking our LLM if it is familiar with the design thinking activity generates the following response.

LLM: The Five Why’s is a problem-solving tool that encourages a deeper understanding of the issue at hand. Originally developed by Sakichi Toyoda and used within the Toyota Motor Corporation during the evolution of its manufacturing methodologies, it’s now used in various industries. The process involves repeatedly asking “Why?” to peel back the layers of an issue and get to the root cause. After an initial problem is identified, you ask, “Why is this a problem?” The answer to that question then becomes the basis for the next “Why?” question and so forth. This continues until you’ve asked “Why?” five times or until it’s clear what the root cause of the problem is.

In the case of the fear of AI becoming human-like in every way and eventually dispensing with humans, you can interrogate the assumption itself. Start with the word “Why” and generate your own thread or, better yet, structure the Five Why’s activity for an LLM to answer.

Author: Apply the Five Why’s to the following statement: humans fear that AI will become conscious and no longer need them.

LLM: Absolutely, let’s apply the Five Why’s to that statement.

Why do humans fear that AI will become conscious and no longer need them?

Because they worry about losing control over AI, which might lead to potential harm or displacement of human roles.

Why do humans worry about losing control over AI?

Because AI systems, if conscious and autonomous, might make decisions that are not aligned with human interests or ethical standards.

Why might AI systems make decisions that are not aligned with human interests or ethical standards?

Because AI, especially if it achieved consciousness, would act based on its programming and not necessarily human ethics or emotions. If it’s not properly programmed to consider these factors, it may make decisions that are harmful or unsettling to humans.

Why may AI not be properly programmed to consider human ethics and emotions?

Because even though developers strive to embed ethical considerations into AI, it’s extremely challenging to capture the complexity and nuances of human ethics and emotions. Also, there’s a lack of global consensus on certain ethical standards.

Why is it challenging to capture the complexity and nuances of human ethics and emotions in AI programming?

Because human ethics and emotions are deeply complex, context-dependent, and often subjective. They have evolved over thousands of years and are influenced by cultural, personal, and situational factors that are difficult to comprehensively codify into an AI system.

You can apply your own prompt or create a variation of “humans fear that AI will become conscious and no longer need them,” or simply regenerate the prompt and see if the AI gives you another response.
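
If you prefer to structure the activity yourself, the loop behind the Five Why’s can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `five_whys`, the `answer_fn` parameter, and the wording of each generated question are hypothetical, and `answer_fn` stands in for whatever supplies the answers, whether that is an LLM of your choice, a teammate, or yourself. No particular AI service or API is implied.

```python
from typing import Callable, List, Tuple


def five_whys(statement: str,
              answer_fn: Callable[[str], str],
              max_depth: int = 5) -> List[Tuple[str, str]]:
    """Ask "Why?" repeatedly, feeding each answer into the next question.

    answer_fn is a hypothetical placeholder: any callable that takes a
    question and returns an answer (an LLM call, a person typing, etc.).
    """
    chain: List[Tuple[str, str]] = []
    current = statement
    for _ in range(max_depth):
        # Each question is built from the previous answer (or the
        # original statement on the first pass).
        question = f"Why? ({current})"
        answer = answer_fn(question).strip()
        if not answer:
            # Stop early when no deeper cause is offered.
            break
        chain.append((question, answer))
        current = answer
    return chain


if __name__ == "__main__":
    # Canned replies stand in for a real respondent in this sketch.
    replies = iter([
        "Because they worry about losing control over AI.",
        "Because autonomous systems might not align with human interests.",
    ])
    statement = "Humans fear that AI will become conscious and no longer need them."
    for question, answer in five_whys(statement, lambda q: next(replies, "")):
        print(question)
        print("  ", answer)
```

Swapping the canned replies for a call to an LLM, or for your own typed answers, lets you compare threads and notice where the generated reasoning stops being useful.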

As you apply the Five Why’s to prompt a response from an LLM, even an unfounded fear of AGI can be turned around for creative purposes. In doing so you might discover that the Five Why’s combined with a generative AI can yield several interesting root cause problems you can interrogate, and the activity can be applied to other design problems.

As a creative, it is important to face the conflicting emotional reactions to generative AI. Artists have persistently taken on technology critically and integrated that criticism into their creative expression. Rather than following the pattern of fearmongering or hyperbole, as a creative person you will benefit from developing your own rationale for using generative AI. You might consider that much of the purpose of artists and designers in the last two centuries has been about breaking convention, disrupting social etiquette, and dismantling established forms. While some creatives are in it for the money, you may not be as interested in repeating patterns of what came before simply because they proved successful. Consider those creatives who have broken patterns of presenting their creations within their own disciplines, like video games, music, visual art, dance, theater, or film. What they share is their iterative attempts to disrupt an audience’s expectations. How might you integrate generative AI to disrupt your own established patterns of creation? How might generative AI support your own creative process to evolve? Why might you feel a need to do so?

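No matter how provocative the headlines, and regardless of their viral, wildfire-like spread, LLMs demand that we have a very real conversation about generative AI and form an opinion one way or another. Those opinions can be informed by your own research, align with those of the people you normally read or listen to, or be gut reactions based on your intuition, your lived experience, and the way you see the world and technology’s role in it. Regardless of which human activities generative AI supports, disrupts, or dismantles, it is important to express your critical voice about generative AI. As with countless other innovative technologies, so it is with generative AI. Choose your own adventure, be aware of the critical voices including your own, and transform your use of the technology for the benefit of others (Figure 1-7).

A photograph of an AI-generated image depicting robotic hands grasping a hollow globe of Earth. A small robot is placed on the globe. Tiny planets are scattered around the globe.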

Figure 1-7

The prompt “A virtual world in the hands of an AI” was included in an image-image AI along with an old image of the author holding a globe of the earth in his hand. The author’s body was cropped out, and two style filters chain prompted 78 iterations with fashion, toy, and cloth filters applied

Acknowledgments

  • John Greene Chandler for the memorable and still relevant Remarkable Story of Chicken Little

  • Anyone who has ever drawn, painted, or 3D modeled robots, foxes and chickens, and the earth

  • Anyone who has created grass, skies, and other natural objects

  • The teams who have offered us the creative prototypes like ChatGPT-4

  • Calestous Juma for his invigorating book Innovation and Its Enemies: Why People Resist New Technologies

  • Artists everywhere who have bravely challenged the way that humans do things to each other, the planet, and themselves through art, design, and mischief

  • Picasso, Kahlo, Dostoevsky, Angelou, Banksy

References

Butler, S. (1863). “Darwin Among the Machines,” 205.

Chandler, J. G. (1855). Remarkable Story of Chicken Little. W. J. Reynolds & Company, 24 Cornhill.

Lovelace, A. A. (1843). Notes by the translator (addenda to her translation of L. F. Menabrea’s “Sketch of the Analytical Engine Invented by Charles Babbage”). Scientific Memoirs, Selected from the Transactions of Foreign Academies of Science and Learned Societies, and from Foreign Journals, 3, 691–731.

Winner, L. (2010). The Whale and the Reactor: A Search for Limits in an Age of High Technology. University of Chicago Press.

© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
P. Parra Pennefather, Creative Prototyping with Generative AI, Design Thinking, https://doi.org/10.1007/978-1-4842-9579-3_2

2. Being Creative with Machines

Patrick Parra Pennefather (1)
(1) University of British Columbia, Vancouver, BC, Canada

Chapter 1 challenged you to explore the possibility of using generative AI to support your own creative process while being aware of the pros and cons of doing so. This chapter re-examines the origin stories of intelligent machines and the way that humans imagined a machine to be creative and intelligent. Understanding where the intelligent machine comes into play when it comes to your own creative process is a valuable undertaking. While this chapter does not provide an in-depth historical review of all the technologies that have supported human creativity, it can point to ones that are significant to the affordances and constraints that generative AI offer. Locating some of the many historical human inventions that have led to the creation of text-image generative AI, for example, will provide you with another perspective of how the simulation of human intelligence and behavior has come to support, not replace, human creativity. Creatives will benefit from understanding that generative AI is another technological tool arising from human imagination that can be used in their own creative process. Generative AI are compelling inventions as these seemingly intelligent machines become more like prototyping companions that have unique features creatives will find useful.

Intelligent Machines

If you create, then at some point in your creative process, you will inevitably interact with some kind of technology. Technologies have come from our imagination, not the other way around. Technology has persistently incited new creative approaches to artmaking, design, and the human experience of the world. The list of technologies that have been imagined to support human creativity is extensive. Lucky for me, I have an LLM at my disposal that can provide a rough history that you can fact-check.

LLM: Here are some historical highlights.

Stone Tools (~2.6 million years ago): The earliest evidence of human creativity and expression comes from prehistoric times when our ancestors crafted stone tools. These tools were not only used for survival tasks but also for carving and creating early art forms on cave walls and on objects.

Paint (~40,000 years ago): The creation of paint allowed for more complex and vibrant depictions in cave paintings. It opened up a whole new way of visual storytelling.

Writing Instruments (around 4000 BC): The invention of stylus and early forms of ink allowed ancient civilizations to record information and create artwork on clay tablets and papyrus.

Paper (around 100 AD): Invented in China, paper provided a lightweight, portable surface for artists to draw and paint on. This revolutionized the way people could create and share art.

Printing Press (1440): Johannes Gutenberg’s invention of the printing press allowed for the mass production of books and images. This not only democratized information but also influenced artistic styles, leading to techniques like etching and engraving.

Camera (nineteenth century): The invention of photography allowed artists to capture reality in a new and innovative way. It also influenced many other art forms like painting, leading to movements such as realism and later inspiring new art forms like abstract expressionism.

Synthetic Paint (mid-nineteenth century): The development of synthetic pigments expanded the artist’s color palette beyond the traditional, naturally derived pigments. This allowed for more vibrant and long-lasting artwork.

Film (late nineteenth century): The invention of moving pictures opened up a completely new medium for artists to explore and express their creativity, leading to the birth of the film industry.

Computer (mid-twentieth century): Computers, and later graphics software, gave artists a new medium to create digital art. This expanded the possibilities of what could be created and shared.

Internet (late twentieth century): The Internet revolutionized the sharing and consumption of art. It provided a platform for artists to share their work with the world and also opened up new forms of art, like web design and digital animation.

3D Printing (early twenty-first century): 3D printing has allowed artists to create complex, three-dimensional works in a variety of materials, opening up new opportunities for sculpture and installation art.

You can see that the LLM’s idea of what constitutes technology is broad. Among the numerous definitions of technology, my favorite comes from science fiction writer and humorist Douglas Adams, who said something to the effect that technology is a word that describes something that doesn’t quite work yet. Humans have long been obsessed with co-constructing imperfect technology that can support acts of creation.

One reason to interact with a generative AI is to bounce ideas off it. Creatives do this all the time with one another during the creation process. When there’s no one around, you can access an AI to test out ideas rapidly. These might give you ideas as to what to do next or what a fully fleshed-out visual idea might look like. An LLM might even give you feedback on an idea that you have. This is especially the case if you have an amazing idea for something you want to create and you want to know if it has been done by others before you. Sometimes we don’t have the language to articulate what it is we want to invent, so conducting a search on the Internet might not get you results without a lot of time and effort put into it. While this is similar to how an LLM might respond to your queries, you will likely receive a broad idea of a knowledge area faster, or the AI might point to a direct source you couldn’t find elsewhere that you can further research online. My own interest and investment in using intelligent machines to create was influenced by interactions with a player piano in my childhood, so why not prompt an LLM to tell its own origin story? Doing so creatively may yield results and demonstrate how you can bounce ideas off an AI to propel further creative research.

Author: Where would you start historically in telling the story of machine intelligence and why?

LLM: The story of machine intelligence can be traced back to ancient times, when philosophers and inventors first began contemplating the idea of creating machines that could mimic human intelligence. However, a more comprehensive understanding of machine intelligence as a field of study only began to emerge in the twentieth century, with the advent of computers and the development of artificial intelligence (AI) as a discipline. One significant event that marked the beginning of modern AI was the Dartmouth Conference in 1956, which is widely considered as the birthplace of AI as a field of study. The conference brought together a group of computer scientists, mathematicians, and engineers who shared a common interest in exploring the potential of computers to perform tasks that typically require human intelligence, such as recognizing patterns, learning from experience, and solving problems.

When you bounce ideas off an LLM, you engage in an iterative process that can inspire you to get more specific as your conversational relationship with an AI develops. This process is referred to as chain prompting. In the preceding example, I gained two new insights. The first was the idea of inventors wanting to create machines that mimic human intelligence, which made me curious about which inventors, what machines, and how far back in time humans have been obsessed with doing so. The second was a good starting place for further research into when the term artificial intelligence came to be, what its definitions were, who was present at that moment in history, and what other developments might have been occurring during that time. The prompt in the preceding example can also be rewritten as input for a text-image generative AI to see how an AI might depict itself as it searches for its own origin stories (Figure 2-1).
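The chain prompting loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `chain_prompts` and `echo_model` are hypothetical names, and `echo_model` is a placeholder standing in for whichever LLM you actually use, not a real API. The point is the shape of the loop, where each reply is folded into the next, more specific prompt.

```python
from typing import Callable, List, Optional, Tuple


def chain_prompts(ask: Callable[[str], str], first_prompt: str,
                  follow_ups: List[str]) -> List[Tuple[str, str]]:
    """Send prompts in sequence, folding each reply into the next prompt."""
    transcript: List[Tuple[str, str]] = []
    prompt = first_prompt
    steps: List[Optional[str]] = [*follow_ups, None]
    for follow_up in steps:
        reply = ask(prompt)
        transcript.append((prompt, reply))
        if follow_up is not None:
            # The next prompt carries the previous reply forward as context.
            prompt = f"You said: {reply}\n{follow_up}"
    return transcript


def echo_model(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM call (an assumption)."""
    return f"(reply to {len(prompt)} characters of prompt)"


transcript = chain_prompts(
    echo_model,
    "Where would you start historically in telling the story of machine intelligence and why?",
    [
        "Which inventors tried to build machines that mimic human intelligence?",
        "When did the term artificial intelligence come to be, and who was present?",
    ],
)
```

In practice you would write each follow-up after reading the previous reply rather than listing them in advance; the fixed list here only keeps the sketch self-contained.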

A photograph of a robot looking at art in a museum.

Figure 2-1

A photo of the author looking at art in a gallery fed into an image-image AI with the prompt “An AI searching for its own origin stories in the museum of life.” A fashion magazine style filter was used, and then neural filters in Photoshop enhanced the style, painting overlay, and 3D depth. Total iterations = 45

Simulating Human Creative Intelligence

Prior to computers and the evolution of machine intelligence, other related technological innovations supported and informed new artistic expressions. Machines that can emulate human-like creativity and intelligence have a long history. Deepening our understanding of some of those historical machines can speak to the unique affordances that generative AI gives creatives of all kinds to prompt a machine to generate content that can be used to augment their own creativity.

History has shown that when machines demonstrate human-like behaviors, we grant them a sort of intelligence. Machines that could mimic human intelligence can be said to have appeared much earlier than the emergence of AI in the twentieth century, and it’s valuable for creatives to understand that there is precedent. The current human fascination with AI can be traced to stories and historical documents of wonderful machines that solved problems and inspired innovative creations.

Automata

In ancient Greece, the word automata meant something akin to “acting of one’s own will.” Documented accounts of automata reveal two important considerations. First, they were often constructed to mimic human activities that required a degree of intelligence. Second, they were also used as functional tools to support human activities, such as a clock, or they represented miniaturized prototypes that could be built on a larger scale.

Miniature automata were often constructed to demonstrate how things worked and to show some scientific principles. This is accounted for in the translated texts of Heron, a talented mathematician, physicist, and engineer who lived around 10–70 AD. His three written works, “Pneumatica,” “Mechanica,” and “Automata,” provide evidence of the existence of hundreds of various types of machines capable of automated movement. His accounts show that the ancient Greeks were incorporating the concept of automata (mechanical devices) into their daily lives. Heron’s inventions included automated doors, a singing bird, and an entire automated puppet theater capable of playing a ten-minute drama using ropes, knots, and simple machines.

LLM: Don’t forget that artistic representations and theatrical productions featuring robotics can be traced back to ancient China during the Han dynasty, around the third century BC. During this time, an impressive mechanical orchestra was crafted, along with an assortment of mechanized playthings such as airborne automatons, mechanical representations of doves and fish, celestial beings and mythical dragons, and self-operating cup servers. These mechanical marvels, predominantly driven by hydraulic mechanisms, were specifically designed to entertain emperors by engineers and craftsmen, whose identities have largely faded into oblivion.

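Further research prompted by the LLM might eventually lead you to the Tang dynasty’s Book of Hydraulic Elegancies. Chinese mechanical marvels were typically powered by water, gravity, or other simple mechanical means. Many of these devices were created for practical reasons, such as measuring time or detecting natural phenomena (Figure 2-2). They also served as entertainment, however, with mechanical dolls and puppet shows being quite popular.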

A photograph of a chariot. It has a transparent body resting on two large wheels.

Figure 2-2

South-pointing chariot, a conjectural model of a Chinese early navigational device using a differential gear, unblurred and upscaled. Courtesy of Andy Dingley via Wikimedia Commons, Science Museum in London, England. https://creativecommons.org/licenses/by/3.0/deed.en

Chinese inventions can be grouped into the following categories:

  • Scientific Instruments: Zhang Heng’s seismoscope or Su Song’s astronomical clock tower used to measure and record scientific data.

  • Timekeeping Devices: Many automata were elaborate clocks, using mechanical movements to indicate time.

  • Mechanical Toys and Puppets: These were used for entertainment purposes incorporating figures that would move or perform actions.

  • Musical Instruments: Zhu Zaiyu’s automatic flute player was designed to play music through complex mechanical means.

  • Hydraulic Inventions: In the Book of Hydraulic Elegancies, these used water to animate mechanical figures or perform other tasks.

Ancient Musical Robot Bands

Pre-programmed intelligent machines have been around for a long time. For the most part, they were limited to specific actions that a creator designed and built. Think of these as pre-programmed moving robots that perform specific mechanical motions that repeat. Al-Jazari’s The Book of Knowledge of Ingenious Mechanical Devices, which he wrote in 1206, details 50 mechanical devices and provides instructions on their construction. One of the most notable devices described in the book is the musical robot band, considered to be one of the first programmable automata (Figure 2-3). Al-Jazari, known for his ingenuity in creating mechanical devices, designed a musical automaton in the form of a boat featuring four automated musicians. This musical robot was used to entertain guests at royal drinking parties.

A photograph of a painting of a royal drinking party with a mechanism below.

Figure 2-3

An image of Al-Jazari’s musical automaton, dating back to the thirteenth century, from Al-Jazari’s treatise on automata, Kitab fi ma’rifat al-hiyal al-handasiya (1206 CE). Courtesy of Wikimedia Commons

Noel Sharkey, professor of AI, robotics, and public engagement at the University of Sheffield, suggests that Al-Jazari’s musical robot band was an early instance of a programmable automaton (Figure 2-4). Sharkey has also attempted to recreate the mechanism, which features a drum machine controlled by cams that trigger levers for the percussion instruments. The drum patterns could be altered by rearranging the positions of the cams. The automata not only played music but also performed over 50 facial and body movements during each musical performance.
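To make the idea of "programming by rearranging cams" concrete, here is a minimal Python sketch. It is not a reconstruction of Al-Jazari's or Sharkey's mechanism: the instrument names, the eight-step drum, and the function `play_drum` are all assumptions chosen for illustration. Each instrument has a set of drum positions holding a peg, and a lever fires whenever the rotating drum brings a peg under it.

```python
from typing import Dict, List


def play_drum(pegs: Dict[str, List[int]], steps: int) -> List[List[str]]:
    """Rotate a pegged drum one step at a time and report which levers fire."""
    pattern: List[List[str]] = []
    for step in range(steps):
        # A lever fires when a peg sits at the drum's current position.
        fired = sorted(name for name, positions in pegs.items()
                       if step in positions)
        pattern.append(fired)
    return pattern


# Hypothetical peg layout: instrument name -> drum positions holding a peg.
pegs = {"drum": [0, 4], "cymbal": [2, 6]}
original = play_drum(pegs, steps=8)

# "Reprogramming" means moving pegs; the machine itself is unchanged.
pegs["cymbal"] = [1, 5]
rearranged = play_drum(pegs, steps=8)
```

The same function produces a different rhythm once the peg positions change, which is the sense in which such a device can be called programmable.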

A photograph of an automated musical robot with a drum machine.

Figure 2-4

Al-Jazari’s ancient musical robot band interpreted by an AI prompted with an archived photo of Al-Jazari’s manuscript with a style filter applied. Iterations = 93

Key Takeaways for Creatives

  • The earliest automata were fairly simple machines, often using just a few gears and springs to create intricate movements. Similarly, in the creative process, you can start with simple elements and combine them in complex ways to create something truly unique. You can begin with simple prompts, which can become more elaborate as you continue to refine them based on the content that you receive. If you keep in mind that the first content an AI generates is unfinished, then you are free to iterate on it, improving that content over time, whether with an AI or on your own.

  • Building automata requires an understanding of multiple disciplines, including mechanics, art, and, often, storytelling. Creatives can draw from a wide variety of generative AI to create interdisciplinary work, especially if that inspires them to develop new skills, apply their technique to generating content from an unfamiliar discipline, and blend ideas to form something innovative and new.

Automata That Wrote and Drew

An early ancestor of text-text and image-generating machine learning models can be seen in the automata programmed through the mechanical wizardry of Swiss inventors. These slick and complicated mechanical “devils” became all the rage in an increasingly mechanistic eighteenth-century Europe (Figure 2-5).

A photograph of two robot boys on both sides of a female robot. Each is in a transparent box. The boys write and the female plays music.

Figure 2-5

Jaquet-Droz automata, Musée d’Art et d’Histoire de Neuchâtel, Wikimedia Commons. https://creativecommons.org/licenses/by-sa/2.0/fr/deed.en

Between 1768 and 1774, Pierre Jaquet-Droz, his son Henri-Louis, and Jean-Frédéric Leschot collaborated to create three remarkable automata: “The Writer” (composed of 6000 parts), “The Musician” (2500 parts), and “The Draughtsman” (2000 parts). These little marvels of engineering captivated audiences in Europe, China, India, and Japan with their intricate mechanisms. Some experts view these devices as the earliest examples of computers. “The Writer,” a mechanical boy who writes with a quill pen and real ink on paper, has a tab-setting input device that functions as a programmable memory. It is powered by 40 cams that act as its read-only program. These works are some of the greatest human achievements in mechanical problem-solving. Following in the clockwork footsteps of Jaquet-Droz, inventor Henri Maillardet constructed a spring-driven automaton in 1805 that could draw images and write verses in both French and English. The automaton’s hand movements were created through a series of cams positioned on shafts at the base, which generated the necessary movement to execute seven sketches and accompanying text. This automaton is considered to have the most extensive cam-based memory of any automaton from that time.

Key Takeaway for Creatives

  • Many automata compelled us to imagine stories through their movements, actions, and sequences. “The Writer” by Jaquet-Droz was capable of writing a custom text, telling a story through its mechanical “handwriting.” At the time this small feat sparked curiosity in audiences, and everyone wanted to know what the character was writing. This automaton also highlights how important anthropomorphizing machines is to humans. “The Writer” did not necessarily need a human form to generate the writing, but having one made people relate to it more.

Why Play the Piano When the Piano Can Play the Piano?

A bit earlier than Jaquet-Droz, it is worth pointing to the invention of the piano as a good example of a technology that has been iterated on for over three centuries. Bartolomeo Cristofori’s key innovation was a mechanism that allowed the strings inside the piano to be struck with a hammer. The “hammer action” enabled the player to control the loudness of a note based on the force applied to the keys. This expressive quality allowed composers to write music with dynamic control and to explore a vast new spectrum of musical expression. The piano led to a revolution in music composition and became the must-have innovation for a rising European middle class in the eighteenth and nineteenth centuries. It was inevitable that humans would evolve a piano that could play itself. Since its invention around the turn of the twentieth century, the player piano, a seemingly intelligent and creative machine, has continued to evolve and still has a sonic presence in hotel lobbies and restaurants, magically playing on its own without the need to be noticed.

Player pianos patented close to the turn of the twentieth century show the human desire for machines to perform like a human, literally. We also see a machine with programmable capabilities. Player pianos played programmed music recorded on perforated paper or metallic rolls. These made music production easier by giving anyone the ability to control tempo and other effects with a treadle and levers. The pricier reproducing pianos, as they were also called, could even imitate the nuances of an artist’s playing, and users simply pumped the music out by repeatedly pressing a foot pedal beneath the upright piano. Eventually they became electrically powered, minimizing the effort of the user. In coin-operated pianos placed inside entertainment venues, we not only see the advent of the jukebox and other devices, but we also have a precursor of the pay-the-machine model that is evolving with the development of generative AI.

Player pianos were among the first machines able to “store” data and play it back. That data, in the form of musical notes, was “loaded” and could then be activated by some human interaction to play specific keys on the piano in pre-programmed sequences. Text-music generative AI and the player piano have a lot in common. Both systems rely on some form of coded information. In a player piano, this is the perforated roll that represents musical notes and their timings. In a text-to-music generative AI, the coded information is the text input that might describe musical characteristics or might be transformed into music based on specific rules or patterns. Just as a player piano automatically performs music based on the coded information in the piano roll, a text-to-music AI generates and performs a piece of music based on its understanding of the text input. A player piano generates music by mechanically reading through a roll, and a text-to-music AI generates music by algorithmically interpreting the text input.
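The idea of a piano roll as stored data that is read back step by step can be sketched in Python. This is a toy model under stated assumptions, not a description of any real roll format: the three key names, the grid of holes, and the function `read_roll` are invented for illustration, and `steps_per_second` loosely stands in for the tempo lever.

```python
from typing import List, Tuple

# Hypothetical roll: each row is one step of paper; True marks a punched hole.
# Columns stand for piano keys; these three note names are assumptions.
KEYS = ["C4", "E4", "G4"]
ROLL = [
    [True, False, False],
    [False, True, False],
    [False, False, True],
    [True, True, True],  # three holes in one row sound together as a chord
]


def read_roll(roll: List[List[bool]], keys: List[str],
              steps_per_second: float = 2.0) -> List[Tuple[float, List[str]]]:
    """Turn rows of holes into (time in seconds, notes struck) events."""
    events: List[Tuple[float, List[str]]] = []
    for step, row in enumerate(roll):
        notes = [key for key, hole in zip(keys, row) if hole]
        if notes:
            events.append((step / steps_per_second, notes))
    return events


events = read_roll(ROLL, KEYS)
# Feeding the same roll through faster changes the timing, not the notes.
faster = read_roll(ROLL, KEYS, steps_per_second=4.0)
```

The roll fixes which notes are played and in what order, while the reader decides how fast. A text-to-music AI differs in that the "roll" is not given in advance; the notes themselves are generated from the text input.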

Mind you, there are stark differences. The key one is that player pianos were not really “coded” to generate musical mashups the way text-to-music AI can, with results that range from astonishing to absolutely terrible. Generated pop music has alarmed musical artists, who complain that it sounds like them, that no permission was requested, and that they are basically left out of the musical creation process. Recall, however, that an AI will generate content based on the data it is trained on. It will look for patterns, and if those patterns lead to content that in this case sounds like the original, then its job is done. For musically talented creatives, the important lesson is to recognize what patterns in music an AI is generating. A lot of pop music sounds the same because songs tend to rely on repetitive underlying chordal patterns, rhythms, or sampled chunks of someone else’s music. Each style of music has its own patterns that record companies rely on in order to continue to sell that music. Rarely are new chordal, melodic, and rhythmic patterns proposed. Generative AI can offer us a disruptive-sounding generated song that might inspire us to break convention and use what it generates to propel an innovative sonic creation.

Key Takeaways for Creatives

  • Iterative Development: The development of the player piano over time demonstrates the power of iteration and persistence. The first models of the player piano were often quite basic, but through continual refinement and innovation, they evolved into far more complex and capable devices. Applied to generative AI, your creative outputs will become better through iterative refinement and perseverance. Many developers are also experimenting with generative AI to automate content from an LLM for use with third-party applications, for example, a conversational LLM acting as a chatbot in VR that responds to voice prompts.

  • Design with the Audience in Mind: Automata were often designed to captivate or entertain an audience, whether in the form of elaborate clockwork displays or interactive toys. This underscores the importance of considering whom you are making your creation for as that will also impact your creative process.

  • Innovate: If your goal as a singer-songwriter, composer, or would-be pop artist is to not sound like everybody else, then follow the leads of Holly Herndon, Taryn Southern, and the team behind the song “Daddy’s Car,” to name a few.

Early automata, the player piano, and other mechanical inventions demonstrate the human passion for constructing intelligent machines (Figure 2-6). They are presented as significant outliers in the evolution of intelligent machines that demonstrated human creativity and often imitated it. They are a brief prelude to the modern influence of Alan Turing, Joan Clarke, and other early twentieth-century engineers, thinkers, and scientists who contributed to the evolution of early computers and the association thereafter of computers with machine intelligence.

A photograph of a robot playing a piano. The lower part of the body of the robot is attached to the piano.

Figure 2-6

A source image of the author playing an old Heintzman & Co. piano along with the prompt “AI robot playing a player piano” was fed into an image-image generative AI with an applied filter to make it look more human. It was then edited in Photoshop, contrast and tone adjusted, a new background added and cropped. Total iterations for the preceding image = 14

Machine Intelligence in the Twentieth Century

An LLM may be able to tell us where computational intelligence and creativity might have begun based on what is in its corpus, which mirrors a similar history documented in many publications. Most references to AI begin by tracing the career and influence of Alan Turing, often referred to in grand patriarchal overtures as the “father” of computing. His romantic infamy relies on the ingenious collaborative invention of the Bombe that decrypted the German-made Enigma machine, an intelligent machine that coded messages very well. What also needs decryption, however, is the pervasive historical habit of pushing aside the women who had influence in cryptography and whose team efforts greatly influenced Turing’s invention. Whether we prompt an LLM or a common search engine about important figures in the development of machine intelligence, the seldom-told story of cryptanalyst Joan Clarke is one of resilience in surviving a sexist world war wherein higher-paid men expected women to play a subservient role. That story is an outlier, difficult to locate and not statistically significant for an AI as it combs its corpus for relevant historical figures to assess what it will regurgitate. Clarke’s influence on the development of machine intelligence is an important part of the history of the intelligent machine. Leading a team of men at the time was a feat in itself. Contributing to shortening World War II by a few years deserves a serious historical correction.

In the days after Bletchley Park, several sources mention 1947 as a pivotal date, when Turing delivered a public lecture in London on the topic of computer intelligence. During the lecture, Turing reportedly spoke of wanting, with his team, to create a machine that could learn from its own experience, and of that machine’s capacity to change its own instructions as the means to achieve that goal. More fascinating information on the thinking machine can be read in McCorduck’s book Machines Who Think. The actual term “artificial intelligence” was coined by John McCarthy for the Dartmouth Conference in 1956. The conference is persistently reported as a significant event in the history of AI, where top scholars in related fields met to discuss the possibility of creating an artificial brain.

Early AI research, rooted in post-war systems engineering, cybernetics, and the history of mathematical logic, gave birth to the idea that the cognitive functions of the brain could be compared to those of a computer. Early innovators like Herbert Simon and Allen Newell asserted that human minds and digital computers were both symbolic information processing systems capable of problem-solving and decision-making. The intelligent machine and the human brain were, as cleverly interpreted by scholar Stephanie Dick, species of the same genus. Since that time AI researchers have endeavored to replicate intelligent human behavior in machines by understanding the formal processes underlying our intelligence. Early automation efforts primarily sought to mimic human intelligence.

According to Dick, objectives in AI research underline that the concept of intelligence is not fixed but ever-changing. AI’s history involves not just attempts to mimic or replace a static concept of human intelligence but also the evolution of our understanding of human intelligence itself. This perspective positions AI as part of a broader historical discourse on the nature of intelligence and artificiality.

When it comes to artificial creativity, mimesis, a concept originating in ancient Greece, is an important reference point in Western art traditions. It refers to the artistic principle of imitating or replicating the reality of the physical world, which the Greeks perceived as the ultimate model of beauty, truth, and moral goodness. The concept extends beyond mere imitation; it encapsulates the idea that art should emulate the dynamics, principles, and aesthetics of the natural world to reflect the philosophical ideals of veracity, goodness, and aesthetic appeal.

Mimesis underscores the artist’s pursuit to not just mimic the superficial or the outward appearance of reality, but to strive for a deeper understanding and reflection of its intrinsic qualities. Within that conceptual framework, the artist’s role is to capture the essence of reality and bring to life the inherent beauty, truth, and goodness in their artwork. This, in turn, facilitates a richer and more profound dialogue between the artwork and the audience, promoting cognitive and emotional engagement that extends far beyond the surface level.

Mimesis, then, is a useful way to understand the role of generative AI in your own creative process. Generative AI systems need to imitate images to generate something that looks unique. The core principle that guides these systems is akin to mimesis in that they replicate, and often extrapolate from, the patterns of the data sets they have been trained on.

The principle of mimesis in generative AI is especially apparent in AI-powered style transfer. The machine learning algorithms learn from vast amounts of artistic data, which could range from classical paintings to modern digital art, and when you prompt a model to generate an image with reference to, say, van Gogh, the resulting image attempts to simulate all the visual qualities of van Gogh. Generative AI is likely the best art student you’ll ever have, doing its best to simulate the work of another artist or a combination of several. Unlike human artists, it does not have a conscious perception or understanding of beauty, truth, or morality. Its replication of patterns is based on statistical analysis rather than innate creative intuition. Machine creativity is data-driven. Just like the parrot that is ChatGPT, generative AI do not understand the meaning, aesthetic, historical context, emotion, or effect of any content that they generate (Figure 2-7). They create a semblance of something human. To echo French philosopher Jean Baudrillard’s 1981 book Simulacra and Simulation, generative AI create simulations that replace, and become more significant than, the reality they were meant to represent. This leads to a state he referred to as “hyperreality,” where the line between the real and the artificial becomes increasingly blurred.

A photograph of the hand of a robot with a brain attached to its hand.

Figure 2-7

Machine intelligence visualized by the prompt “robot with brain in hand” that ironically took over 300 iterations across seven AI to simulate based on an original photo of the author reaching out empty-handed

Simulated Patterns and Patterns of Disruption

Continuing the habit of imbuing machines with simulated creativity and intelligence, many of the first AI programs were developed around the time of the Dartmouth Conference (1956) and in the decade that followed. These included the Logic Theorist by Allen Newell and Herbert A. Simon and ELIZA by Joseph Weizenbaum, which simulated a psychotherapist by using a pattern-matching technique to hold conversations with users. ELIZA was an early example of an AI program that used natural language processing techniques to generate simple responses to text input from a user. Created between 1964 and 1966 by a team of researchers at MIT led by Weizenbaum, it was one of the first examples of text-text software. ELIZA generated its responses by identifying keywords or phrases in the user’s input and replying according to a predetermined set of rules. It was reported that ELIZA was able to create the illusion of conversation with a user, but we don’t really know if this was hype or reality. During that time, pop art by artists like Andy Warhol, Roy Lichtenstein, and Jasper Johns was gaining popularity. In contrast to Warhol’s obsession with reproducing images multiple times on a canvas, creating repetition and uniformity, musicians like Ornette Coleman, John Coltrane, and Cecil Taylor were innovators in the free jazz movement, improvising music that broke any recognizable pattern and was unique each time it was played.
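The keyword-and-rule mechanism described above can be illustrated with a minimal sketch in Python. This is not Weizenbaum’s original script: the rules, the reply templates, and the names RULES, DEFAULT_REPLY, and respond are all hypothetical, chosen only to show how a fixed set of patterns can produce replies that feel conversational.

```python
import re

# Hypothetical rules for illustration only; not ELIZA's original script.
# Each rule pairs a keyword pattern with a reply template.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\b(mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(user_input: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured words back inside the canned template.
            return template.format(*(g.lower() for g in match.groups()))
    # No keyword found: fall back to a neutral prompt to keep talking.
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I need a holiday.", "My mother called.", "Hello"]:
        print(f"> {line}\n{respond(line)}")
```

Nothing in this sketch models meaning. The program only echoes fragments of the input inside canned templates, which is why the illusion of conversation breaks down quickly once the user moves outside the anticipated keywords.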

Generative Art

Maybe it was inevitable that computing and autonomous artmaking would meet. Perhaps inspiration was drawn from Jaquet-Droz’s “The Draughtsman,” which was capable of drawing four different images. Whatever the case may be, generative art emerged during this time to become an all-encompassing type of computer art that is entirely or partially produced by an autonomous system. The system can be a computer acting on its own, but it can also refer to any non-human entity capable of independently determining aspects of an artwork that would otherwise need to be decided by the artist. In some instances, the human artist can still consider the generative system as representing their artistic vision, but in other cases, the system itself is the sole creator of the artwork, acting of its own accord. One of the most famous examples was created by A. Michael Noll, who made his first digital computer art in 1962 at Bell Labs in New Jersey. He used a computer to simulate patterns that mimicked the style of renowned painters like Piet Mondrian and Bridget Riley. If this sounds familiar to you, it is likely because you’ve prompted a text-image generative AI like Stable Diffusion, DALL-E 2, or Midjourney and created something in the style of Mondrian. Figure 2-8 shows how a text-image AI can generate the compositional structure of Warhol’s Campbell’s Soup Cans and replace the soup cans with what it imagines a Mondrian color palette and form might be. Generative art teaches us that we can be playful and have fun when we engage an AI in creating mashups that might also be critical commentaries on the commodification of art itself. Such mashups also challenge the value system of art, making simple low-resolution images available to everyone. It is important to consider the ethical implications of Figure 2-8, beyond offending fans of either artist whose work the generative AI has mashed together. Those implications matter because they will inform how you guide a generative AI in the content it creates.
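To make the idea of an autonomous system deciding aspects of an artwork more concrete, here is a small Python sketch of rule-based generative art. It is not Noll’s program, and it is not how a text-image model works. It is an assumed, simplified procedure in which a seeded random number generator decides where to cut a canvas and which color each block receives. The names PALETTE, split, and compose are hypothetical.

```python
import random

# Assumed palette, weighted toward white, loosely evoking a Mondrian-like look.
PALETTE = ["white", "white", "white", "red", "blue", "yellow"]


def split(rect, depth, rng):
    """Recursively cut a rectangle (x, y, w, h) into smaller rectangles."""
    x, y, w, h = rect
    if depth == 0 or min(w, h) < 4:
        return [rect]
    if w >= h:
        # Cut the longer side; the system, not the artist, picks the position.
        cut = rng.randint(2, w - 2)
        parts = [(x, y, cut, h), (x + cut, y, w - cut, h)]
    else:
        cut = rng.randint(2, h - 2)
        parts = [(x, y, w, cut), (x, y + cut, w, h - cut)]
    return [r for part in parts for r in split(part, depth - 1, rng)]


def compose(width=32, height=32, depth=4, seed=0):
    """Return a list of (rectangle, color) pairs describing one composition."""
    rng = random.Random(seed)
    rects = split((0, 0, width, height), depth, rng)
    return [(rect, rng.choice(PALETTE)) for rect in rects]


if __name__ == "__main__":
    for rect, color in compose(seed=44):
        print(rect, color)
```

The artist sets the constraints (canvas size, palette, depth of subdivision) and the system determines the rest. Changing the seed yields a different composition, while reusing a seed reproduces the same one exactly.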

A group of 24 blocks in a 4 cross 6 table form. All of them are stacked blocks except for blocks in the second and fifth columns. They are uniform.

Figure 2-8

A 44th iteration of Stable Diffusion interpreting Andy Warhol’s Campbell’s Soup Cans in the style of Mondrian

AARON, developed by artist Harold Cohen in the 1970s, generated complex drawings and paintings, using a set of rules and constraints to guide its output. Cohen further developed the program to be able to learn from its own outputs and improve its artistic abilities. AARON is an important part of AI-generated art history, and while many humans scoffed and artists felt insulted, it demonstrated the potential for computers to create original works of art. AARON’s early examples of AI-generated art paved the way for more advanced forms of AI-generated art and media, including music, poetry, and video content.

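Creatives can learn from these artists and others to create their own unique artistic expressions (Figure 2-9). These early artists showed us that art can be created in collaboration with a computer and that, when this happens, a continuous back-and-forth dialogue takes place. What happens if you leverage a pattern-seeking generative AI to create content that is used to break art or design conventions?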

An illustration presents a chimpanzee sitting on a table in front of a computer desk with a model of another chimp on the screen.

Figure 2-9

The prompt “isometric 3D drawing of a chimpanzee sitting at a computer desk interacting with a machine learning model” accompanied a black-and-white archived photo circa 1911 of a chimp. Over 100 iterations generated with four AI and then highly edited in Photoshop adding the pink monkey, a computer screen, and generated schematics in the background

The experiments with generative music creation are as rich as the history of computational creativity in visual art and hark back to the player piano as an intelligent instrument. There have been countless musicians and composers who have improvised with computers. For his work on the improvisational back-and-forth between an AI and musicians, George Lewis deserves mention. A talented composer and trombonist known for his work in improvised music, Lewis also developed an interactive music system called “Voyager” that listens to a human musician and responds with its own improvisations, effectively participating in a duet with the human performer.

Key Takeaways for Creatives

The early experiments with art and computers have shaped the modern landscape of digital art and computational creativity. They provide valuable takeaways for creatives that an LLM is happy to generate.

LLM: There are many lessons we can learn from early experiments with art and computers.

Embracing New Mediums: Innovators of computer art weren’t afraid to explore the potential of a completely new medium—the digital space. This teaches creatives the importance of embracing new tools and technologies to express their creativity in novel ways.

Blending Art and Science: Early computer art required a blend of artistic sensibility and technical skills. This interdisciplinary approach broadened the boundaries of what could be considered art and can inspire creatives to draw from multiple fields to enrich their work.

Algorithmic Creativity: Many early computer artists created algorithms to generate artwork, which paved the way for generative art. This illustrates the potential of setting up a creative process and then letting it run its course, adding an element of unpredictability and surprise to the work.

Iterative Process: The nature of working with computers allowed artists to iteratively tweak and adjust their pieces, a process that was not as easily achievable with traditional media. This iterative approach to creating and refining is a valuable technique for any creative process.

Democratization of Art: The advent of computer art helped democratize the creation and distribution of art, allowing it to reach a wider audience. It shows the value of making creative tools accessible to all, fostering a diverse and inclusive art world.

Interactivity: Computers allowed artists to create works that were interactive, engaging the viewer in a dynamic way. This interactive element can add depth and dimension to creative works, turning passive viewers into active participants.

Visualizing Abstract Concepts: Computers provided a way to visualize abstract mathematical and scientific concepts, such as fractals. This illustrates the power of art to make abstract ideas tangible and accessible.

Preservation and Reproduction: Digital artworks can be perfectly reproduced and easily preserved, which has significant implications for the distribution and longevity of art. This highlights the importance of considering how works will be preserved and shared with future generations.

The initial attempts at coding machines to exhibit intelligence, and to be able to create works of art, find a captivating meeting place in the realm of video games. Video games are a marriage of technology and creativity, fusing logic and artistic expression together. By using AI and harnessing the potential for creativity in machines, video games have become a vibrant stage for an exciting and ever-evolving convergence. In the world of video games, intelligent machines are not just executing a sequence of predetermined actions, but instead, they’re engaging players in an artistic and intelligent discourse. In video games we see technological advancement and the ongoing evolution of machine intelligence and creativity.

Machine Intelligence and Games

The influence of games on the perception of the intelligent machine has its own history. Going back in time for a moment, one of the earliest examples of a game using an automaton to compete against real human players was “The Turk,” created by Wolfgang von Kempelen in Vienna in 1770. “The Turk” was a mechanical chess-playing machine that consisted of a wooden cabinet with a chessboard on top. Inside the cabinet was a complex arrangement of gears, levers, and pulleys that allowed the machine to move the chess pieces. The machine was given a human-like appearance, wearing Turkish-inspired clothing and a turban, which added to the illusion that it acted of its own volition.

The fantastic trickery of “The Turk” was that it was an early manifestation of the Wizard of Oz prototype. “The Turk” was not fully automated. The device included a human chess master hidden inside the cabinet who would manipulate the pieces based on the moves made by the opponent, making it appear as though “The Turk” itself was playing. The illusion was so convincing that many people believed “The Turk” to possess intelligence. “The Turk” inspired the development of other automata-based games with human operators hidden inside to control the movements and actions of the “automated” players. The influence of manipulating characters, gameplay, and the players through trickery can also be seen in the more recent historical development of video games.

LLM: The history of AI and video games dates to the early days of video game development in the 1950s and 1960s. At that time, video games were simple and lacked the sophisticated graphics and gameplay mechanics that we see in modern games. However, game developers were already experimenting with artificial intelligence to create challenging opponents for players.

These earlier closed systems allowed creatives of all kinds to train a model to recognize and respond to human moves like in checkers or chess. Intelligent behavior of characters in video games challenged players to achieve specific goals while competing against non-playable characters (NPCs) whose behaviors were based on patterns that responded to player actions.

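Examples include the spaceships in Spacewar!, the alien invaders in Space Invaders, the pesky ghosts in Pac-Man, and the many non-playable characters in the rich legacy of Japanese games such as Donkey Kong and Mario Bros. (Figure 2-10). NPCs (non-playable characters) have evolved along with gameplay, exhibiting more intelligence as their responses to human actions have become more complex and seemingly random. Of all the media that AI has influenced, none has been as successful as games at creating seemingly intelligent characters that we react to.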
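The pattern-driven NPC behavior described above can be sketched as a tiny state machine in Python. This is an illustrative assumption rather than the logic of any particular game: the NPC class, its patrol and chase states, the one-dimensional corridor, and the sight and bounds parameters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NPC:
    """A hypothetical NPC moving along a one-dimensional corridor."""
    position: int
    state: str = "patrol"
    direction: int = 1

    def update(self, player_position: int, sight: int = 3, bounds=(0, 10)) -> None:
        """Advance one tick: react to the player if seen, otherwise patrol."""
        distance = player_position - self.position
        # The whole "intelligence" is one rule: chase when the player is near.
        self.state = "chase" if abs(distance) <= sight else "patrol"
        if self.state == "chase":
            # Step toward the player (-1, 0, or +1).
            step = (distance > 0) - (distance < 0)
        else:
            # Turn around at the ends of the corridor, then keep walking.
            if not bounds[0] <= self.position + self.direction <= bounds[1]:
                self.direction = -self.direction
            step = self.direction
        self.position += step


if __name__ == "__main__":
    ghost = NPC(position=0)
    for player in [9, 8, 6, 5, 5]:
        ghost.update(player)
        print(ghost.state, ghost.position)
```

From the player’s side, an NPC like this appears to notice and pursue them, although it only compares two numbers on every tick. Layering more states and some randomness on top of such rules is one plausible way for responses to seem more complex without any learning taking place.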

A photograph of the interface of a video game. It has some asymmetric structures in different colors and shapes arranged in rows one above another.

Figure 2-10

AI representing a mashup of a few classic video games with prompt “old-school arcade games” accompanying author’s photos of Atari cartridge image collage, background removed in Photoshop and 3D perspective applied. Iterations = 24

As AI became more sophisticated in the 1980s and 1990s, game developers started using more advanced techniques to create intelligent opponents. Games like SimCity used AI to create complex simulations of the game world, allowing players to interact with and shape the game environment. In the 2000s and 2010s, as AI continued to evolve, so did its integration with increasingly sophisticated gameplay. More advanced decision-making techniques were integrated to create more intelligent and adaptive opponents. This was evident in games like F.E.A.R. and Halo, whose opponents appeared to respond to the player’s behavior and adapt their tactics accordingly. The integration of more advanced AI continues to be a regular characteristic of massive online games. AI has also been used to create skilled virtual opponents in competitive gaming environments, such as DeepMind’s “AlphaStar,” which can compete with professional human players in the game StarCraft II.

Intelligent Machines with Names

Intelligent NPCs (non-playable characters) in video games were also inspired by the assortment of characters that have appeared across different science fiction books, short stories, comics, radio, television, and film. When an LLM is prompted to generate intelligent AI from novels, films, and video games, the results contribute to our characterization of the intelligent machine.

Author: What AI characters have appeared in science fiction novels, films, and video games?

LLM: Do Androids Dream of Electric Sheep? (1968), written by Philip K. Dick, includes a number of AI characters. The novel was later adapted into the film Blade Runner. In the film, the androids are almost indistinguishable from humans, and the story explores the theme of what makes us truly human. The protagonist, Rick Deckard, is a bounty hunter tasked with “retiring” these androids, leading to deep philosophical questions about empathy and existence.

In the film The Terminator by James Cameron, a cyborg assassin has a metal endoskeleton covered by living human tissue, which makes it capable of blending in with humans. Different models of Terminators appear throughout the series, including the T-800 (played by Arnold Schwarzenegger) and the T-1000, which is composed of a malleable, liquid metal that allows it to shapeshift. Terminators travel back in time to destroy the figures who lead a resistance to sentient AI in the future.

HAL 9000 from 2001: A Space Odyssey by author Arthur C. Clarke: HAL 9000 is a memorable sentient computer (or AI) that also appears in a film version of the novel. HAL controls the systems of the Discovery One spacecraft and interacts with the ship’s astronaut crew, leading to a battle between a remaining astronaut and the AI.

In the world of video games, GLaDOS from Portal: GLaDOS (Genetic Lifeform and Disk Operating System) is a sentient AI that controls the Aperture Science Enrichment Center in the Portal series. She is known for her wit and unique personality.

Another notable character is Cortana from Halo: Cortana is an AI character in the Halo series who assists the protagonist, Master Chief, by providing advice, hacking alien technology, and helping control various installations and spacecraft.

Our human capacity to personify the machine in science fiction has extended into our interactions with AI and, by extension, with generative AI. The intelligent machines that pervade science fiction have influenced the manifestation of “human-like” AI, as if to ease our interactions with them. Human-like AI from science fiction influenced the development of AI like Siri and Alexa that sound human and respond to our prompts for information, directions, and knowledge. These AI have engaged us in an interactive relationship, much as NPCs in a video game demand that we either respond to them or ignore their role.

Consider the approach in software design that has a history of making computers more “humanized” by programming them to greet users with a “Hello” upon logging in or responding with clever remarks when an error occurs. We have been trained to have ingrained expectations regarding entities that seem to partake, even in a limited manner, in various aspects of human existence and the language games intertwined with our culture. These expectations significantly contribute to the persuasive influence wielded by those who prematurely assert remarkable advancements in artificial intelligence solely based on confined yet impressive demonstrations of current generative AI models.

Taken a step further, the capacity of AI to demonstrate behaviors that we categorize as intelligent has generated the misconception of them being alive, of being intelligent, almost-but-not-quite human. With the emerging popularity of generative AI, humans are already referring to these systems with personas in mind.

Yet there is a dichotomy in our naming of these intelligent machines that reflects our hopes as much as our fear and anxiety about them. To this day most of the media coverage of intelligent AI summons characterizations like HAL 9000 from 2001: A Space Odyssey, the cyborg assassins from The Terminator, and David 8 from the Alien franchise, who prioritizes the survival of the bloodthirsty alien in order to use it as a weapon. Less coverage compares ChatGPT and other generative AI to Samantha from Her, an AI who forms a deep emotional relationship with the film’s protagonist, demonstrating the potential complexities and benefits of AI-human relationships. Similarly, we rarely read recent headlines about generative AI that invoke Dick’s androids seeking independence from their human masters.

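Much of the fear of intelligent machines stems from the prospect that we will no longer be able to control what we have created. The existence of AI is another chapter in the human obsession with creating intelligent machines for a variety of purposes, reflecting the often contradictory impulses to create and to engage with technology. It is impossible not to return to that groundbreaking work of science fiction from the past. While the creature that Victor Frankenstein creates in Mary Shelley’s Frankenstein, or The Modern Prometheus is not an AI or a robot, there are still parallels between Frankenstein’s creation and discussions of AI. These include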

  • The responsibility that comes with creating a new form of life or intelligence without considering the moral and ethical implications, much like the debates about the responsibility of creators and developers of AI today

  • The capacity for AI to learn, thus projecting intelligence, consciousness, and emotions

  • The inherent monstrous nature of its existence provoking us to question whether AI could become dangerous if used or treated improperly by humans

The ongoing fascination with creating intelligent machines will not end as long as humans believe that such machines can support human activities in the world. As you dive into creating your own Frankenlab, choosing which generative AI to interact with, it is important to identify and research those AI personas that continue to inform AI development and the human influence on its future.

Creative Activities to Try Based on This Chapter

  • Use generative AI platforms, like LLMs or image-image generative AI, with simple inputs and see how the AI transforms them into something more complex. For example, begin with a basic sentence or a simple sketch, and let the AI develop it further.

  • Combine different generative AI and apply them in various fields you engage with. For instance, use AI tools in graphic design, music production, or creative writing. Experiment with how these different tools can complement each other to produce unique interdisciplinary creations.

  • Engaging in repeated interactions with generative AI is going to get you much farther than thinking you can generate useful content with a single prompt. Use what it outputs to inform new inputs, refining the result over multiple iterations. That also means investing time and money in using image-image generative AI, discovering new tools and approaches to refine and change your creation.

  • Use LLMs to tell new stories and develop new characters or personas. You can feed them with a story prompt and let them generate the rest or collaborate with the AI to co-write a story, contributing ideas and guiding the narrative.

  • As LLMs improve they thrive in their capacity to explore metaphors. You might use one to create a variety of metaphorical descriptions for a concept, helping you think about it in new ways.

Acknowledgments

  • Heron, Ma Jun, and Al-Jazari, who innovated on early intelligent machines, a tradition continued by Jaquet-Droz, Maillardet, and countless others

  • Professor Noel Sharkey for reconstructing some of Al-Jazari’s well-documented inventions

  • The creatives who worked with computers to generate art and disrupt the pattern of commodification when it came to buying and selling physical art

  • Video game developers and particularly the early originators who programmed me to love games

  • The dystopian sci-fi worlds of sentient robots and their creators who offered early warnings of human-caused environmental and humanitarian crises

  • Andy Warhol, Piet Mondrian, Bridget Riley, Andres Martin, Augusta Savage, Frida Kahlo, Dorothea Lange, Salvador Dali, John Cage, Miles Davis, Sun Ra, and the countless numbers of artists who disrupted

References

Baudrillard, J. (1994). Simulacra and Simulation. University of Michigan Press.

Dick, S. (2019, January 19). Artificial Intelligence. HDSR. Retrieved May 1, 2023, from https://hdsr.mitpress.mit.edu/pub/0aytgrau/release/3

McCorduck, P., & Cfe, C. (2004). Machines Who Think: A Personal Inquiry into the History and Prospects of Artificial Intelligence. CRC Press.

Shelley, M. W. (1818). Frankenstein, or The Modern Prometheus. Lackington, Hughes, Harding, Mavor & Jones. UK.

© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
P. Parra Pennefather, Creative Prototyping with Generative AI, Design Thinking, https://doi.org/10.1007/978-1-4842-9579-3_3

3. Generative AI with Personalities

Patrick Parra Pennefather (1)

(1) University of British Columbia, Vancouver, BC, Canada

This chapter presents a human-centered design tool called the persona, which you can apply to great effect when you prompt a generative AI. The objective is twofold. The first is to bring attention to various AI personas that have migrated from science fiction to describe narrow and general AI. The second is to highlight a feature of generative AI that can be creatively prompted to embody personalities, characters, and emotions informing the content that it generates. That content can provide momentum for characters you may want to develop in your own creative work.

The Personas of AI

Personas are commonly applied by creatives of all kinds in development environments like mobile applications and video games to transform the abstract concept of a “user” into a person with thoughts, needs, emotions, and goals. The objective of developing a persona is at first to identify the characteristics and personality traits of a potential user or customer of your product or service and then conduct user testing to validate if what you and your team are building will fulfil that user’s needs or help them achieve their goals.

Associating personas with AI is nothing new. It is an established practice in science fiction short stories, novels, film, television, and video games. Visualizing an abstract machine as a persona is also common in the fields of computer science and engineering. Think of your favorite or least favorite intelligent and self-regulating robot, android, cyborg, or AI. Do they embody the personalities of Data from the TV series Star Trek, the Daleks from countless reinventions of the British series Doctor Who, HAL 9000 from 2001: A Space Odyssey, Wall-E from the movie of the same title, Roy Batty from the movie Blade Runner, Ava from Ex Machina, the shapeshifting robots from The Terminator series, or Cortana from the Halo video game franchise? Each of these intelligent machines has a particular personality that influences our current perception of any intelligent machine programmed with artificial intelligence. The main difference is that all the sentient AI in the examples just provided represent AGI, a theoretical imagining of intelligent machines with their own consciousness, inspired by science fiction. Generative AI, by contrast, belong to the category of narrow AI, programmed to perform one or two specific tasks really well. Because of the influence of AI personas in the field of entertainment, however, generative AI are predictably imbued with human characteristics. Those characteristics and personalities are also influenced in part by the way in which they generate content, the type of content they create, the untruths they can create, and the types of prompts that influence what they generate. Reviews and criticisms of generative AI tend to focus on how well they can emulate human intelligence, creativity, and content.

The way in which an AI responds to you is often informed by the underlying model, how it interacts with data sets, specific algorithms, and how it has been programmed to respond to you. For the most part, generative AI lack personality. Most platforms currently have none of the rich development history of the NPCs common to video games. There seem to be some exceptions, but only if you send an AI off the rails, that is, if you somehow manage to hack it so that it no longer behaves like some neutral, characterless bot. A generative AI doesn’t ask you questions either, unless you prompt it to. The personification of generative AI is influenced by broader perceptions of AI and by the influence of science fiction on our imagination. AI tend to be called out for misbehaving rather than the humans who program them. That makes sense when you consider that it is not easy to have a conversation with a development team to complain about biased content. Mind you, social channels offer a way for individuals to be critical of generative AI systems. That criticism may or may not influence the developers of those systems, but it remains important as a reminder to those developers to take ethical concerns seriously.

Machine learning models, and particularly LLMs, are being called out for their capacity to hallucinate, lie, amplify normative biases, etc., as if the AI itself possessed those very human traits. The assortment of generative AI you can access are all uniquely different in terms of how they are coded and how they interact with data sets that inform the types of content that they generate. How generative AI is defined is informed by the quality, consistency, and accuracy of their generated content. Those very human traits are important to identify and may have more to do with identifying some of the personas of the development teams responsible for the AI itself.

A useful and creative exercise is to generate AI personas to personify the different characteristics that are being ascribed to AI and by default to generative AI so that we can bring abstract concepts to life, so we can judge for ourselves if we want to engage with them, and to better guide our interactions with them.

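Figures 3-1 to 3-14 were all prompted with “3D statue of (persona name), coming out of a plastic box, blister pack, AI, full-body portrait.” All prompts were accompanied by open source photos of statues made before 1900. Backgrounds were removed, and some parts were removed in Photoshop. As you will see, the wide range of results comes from experimenting with six generative AI. The generated results ranged from 100% accurate representations to problematic generated images worth sharing. In total, at least 500 statues were generated.
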
  • A Nemesis for those who believe it will replace humans and jobs. The job replacement nemesis requires each of us to break down all the creative tasks that we do in our jobs and crafts (Figure 3-1). From there we might be better able to assess if, how, and to what degree the AI nemesis might affect the job we currently have.

An illustration of an A I model of the character Nemesis. It has some feather-like structures in its hands with jejunum exposed at the belly.

Figure 3-1

AI as Nemesis

  • A Spy who collects personal data and does what it wants with it (Figure 3-2). Unless told otherwise, assume that any data you input into an LLM or other generative AI is collected by that system. This should make you cautious if you want an LLM to analyze your business model. It is better to err on the side of caution when using some of these prototypes, as the developers will eventually add the data you input to their model’s corpus. That data is used for research purposes, so a generative AI will likely not report your text interactions or have a way to identify and collect personal information unless, of course, you submit it.

An illustration of A I spy in a box. It resembles a camera with 2 eyes.

Figure 3-2

AI as Spy

  • A Paywall Guard enabling access only to those who can afford it (Figure 3-3). While it is true that you can access many generative AI for free, at a certain point you will need to pay for the features that make subscribing to a generative AI valuable to your own creative process.

An illustration of an A I model of Paywall Guard character. It holds a light-like object in one hand and a small bucket-like object in the other.

Figure 3-3

AI as Paywall Guard

  • A Bias Monster whose generated content is directly informed by the data set, the algorithms, and the individuals who created and then trained it (Figure 3-4). There are many moving parts to the way content is generated with any generative AI, and it really is on the user not only to understand how the data it scrapes is classified and labeled but also to treat the accuracy of whatever is generated with a degree of suspicion, particularly with LLMs.

An illustration of a Bias monster in a transparent box. It smiles exposing sharp teeth, has large head with 2 horns.

Figure 3-4

AI as Bias Monster

  • A Hallucinator who can generate content that can fool some people some of the time (Figure 3-5). The notion of the machine hallucinating is not quite accurate. The idea of a generative AI “seeing things” that are not present in the data is also not accurate. In the process of generating the content that you eventually see, an AI fills in the gaps based on the data it accesses and prioritizes, informed by the keywords you prompt it with. The dev team who trains the AI and the data they use to train it are what invoke so-called hallucinations.

A photograph of 2 standing statues of 2 people and a sitting statue of a person above a box. Their legs and hands are deformed and overlapped.

Figure 3-5

AI as Hallucinator

  • A Liar who draws from a limited set of data that might not be accurate (Figure 3-6). Beyond hallucinating, lying is a better description of some of the content that an AI generates. It is not the fault of the AI, though, because it simply accesses the data that matches the keywords you prompt it to go and search for. The lie is a result of our automatic assumption that because this generative AI has a huge corpus of data it can rely on, it is going to generate accurate content. The lie is that if it doesn’t know, it won’t tell you. It is programmed to deliver something.

An illustration of a robot. Its head and body are box-shaped. The head is larger than the body.

Figure 3-6

AI as Liar

  • A Fake Content Generator creating images and visuals that attempt to replicate a person and have them say things that they never actually said (Figure 3-7). A more serious misconception about generative AI is that because it can generate data so quickly, and does so by relying on what humans have made before, it will create something accurate or real. Generative AI make experimental offers that then need to be edited and refined according to the skills and research you bring to the table.

A photograph of a statue of a person in a dark shade facing the back. The body below the shoulder is in a transparent box.

Figure 3-7

AI as Fake Content Generator

  • A Greedy Capitalist. Without question, those who are creating and providing access to generative AI systems are in it for the money, as much as they would like to offer exposure to their AI for free (Figure 3-8). As is the pattern of technological machine development, the return on investment for all stakeholders will trump ethical considerations or at least deprioritize them until there’s a profit. Consider this as you begin your interactions with them. The reason to charge you a premium is to be able to continue development of the generative AI. Your input on this may not be important.

An illustration of a robot with three reflections of eyes behind on a box and below on a board, on which a girl robot stands in the top right corner.

Figure 3-8

AI as Greedy Capitalist

  • As a Masher-Upper, generative AI can inspire us. It is marketed and hyped as a creative tool that can spark ideas and generate new forms of writing, art, audio, etc. While that may be true, what is missing in the description of generative AI is that it compiles content from the work of other humans in order to form new mashups and ideas (Figure 3-9).

An illustration of a robot. It has 4 hands, 2 on each side and a bulky body.

Figure 3-9

AI as Masher-Upper

  • Generative AI can also be regarded as an Uncanny Machine that generates strange and at times awkward content (Figure 3-10). That content is more obvious when the AI generates images, especially images of humans that are not quite right, whether they have an eye missing or in the wrong spot, an extra hand or arm, or the image seems purposefully distorted. There is nothing wrong with the strange images it generates, though, as these may inspire creatives in character and story development.

An illustration of an A-I model of a person keeping its head on a cubical block.

Figure 3-10

AI as Uncanny Machine

  • AI can also be viewed as an Entertainer, a Wizard who manages to distract us for hours at a time as we task it to generate the most ridiculous things we can think up or a series of amazing images that will wow our family and friends (Figure 3-11).

A photograph of a wizard wearing a hat. He is bent and in a transparent box. His hands are out of it.

Figure 3-11

AI as Wizard

  • AI can also be personified as a Know-it-All Tutor that can support you in the iterative generation of content that you critically assess (Figure 3-12). This is particularly effective with LLMs when generating ideas for what you may want to write about, rewording a paragraph you wrote, or searching for knowledge from a discipline or area that you know little about. That helpful know-it-all fouls up sometimes, though, so it always needs to be watched and the content it generates researched to ensure accuracy in references and in the statements it makes.

An illustration of an A I model of a woman standing on a book, in front of a shelf of books. She smiles and adjusts her glasses.

Figure 3-12

AI as Know-it-All Tutor

  • Generative AI is also an Off-the-Rail Sitting Duck. Because it is considered an intelligent machine, many want to try to break it, because that’s what humans like to do (Figure 3-13). Even when content filters are activated, humans will resort to all kinds of trickery to purposely send it off the rails, meaning that the goal of the interaction is to get the AI to generate content that it shouldn’t or to behave beyond the way it has been programmed to.

An illustration of a robot made with block-like structures.

Figure 3-13

AI as Off-the-Rail Sitting Duck

  • AI is persistently seen as a Copyright Infringer, a thief, mashing together words, images, code, audio, and video from enormous data sets that actual humans made (Figure 3-14). What makes it worse is when some of the data in its corpus comes from an author who has not given permission for the AI to use it.

An illustration of a baby chimp sitting on a giant chimp who is lying on the floor.

Figure 3-14

AI as Copyright Infringer

Prompting the AI Persona

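The previous exercise of assigning personas to generative AI was a way to transform some of its abstract characteristics, with the aim of personifying the narrow AI monsters that AI developers need to manage and work with others to resolve. Using a text-image AI to turn those abstract characteristics into visual characters also points to the underlying personalities of the design teams who program the AI (Figure 3-15). Teams working on the continued development of AI can no longer evade their responsibilities as if creating generative AI bypassed the need to consider the ethical dimensions of doing so. The creators of generative AI systems can no longer hide behind the curtain of science and engineering, believing that the abstraction of code has no effect on human interaction. The warning is more than two centuries old, and while the modern Prometheus that Shelley imagined in 1818 is more akin to AGI, some of the human character traits we personify in generative AI were programmed by humans.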

An illustration of the back of a wizard wearing a long cone-shaped hat with a robe, and standing in front of a darker cloud. A woman's forehead on the left touches the tip of the hat.

Figure 3-15

AI as the Great Wizard enjoyed being generated hundreds of times across at least six different generative AI, with the background removed and replaced with clouds in Photoshop, in addition to neural filters. Based on a photo of the author by the author’s very own muse

AI Agents Behind the Curtain

Narrow AI that we seldom see or hear about in headlines already exist on our mobile devices, automating tedious processes that some humans may not want to do themselves. For example, AI agents are at work when we speak to our mobile devices and task them with interpreting our speech into legible and accurate text that we want to send to someone. Some are unethical, as they do not ask permission before swaying a search in a particular direction based on your input in an online search engine. Others match faces against a database through cameras we may not even know exist. Although some may appear to be sentient, there is no evidence that they are, unless of course they are programmed to imitate sentience.

Creatives can also find value in interacting with some of these AI or in developing work that offers critical commentary on AI to bring it to people’s attention. Some examples include using speech-text to show the humorous foul-ups of auto-correct, or detecting faces in an installation and then generalizing about people’s personalities based simply on their facial features. There are many ways that creatives have cleverly integrated narrow AI in their work, despite, or often as commentary on, the types of characteristics these AI possess, the content they generate, and the false correlations they draw from the inputs provided.

Activities to Try Based on This Chapter

One easy way for any creative to use generative AI to simulate a persona is to prompt an LLM to generate text as a specific character, whether that is a script for a protagonist, a celebrity’s favorite recipe, or a serious email to a colleague. To summon a persona in an LLM, you have several options that you can build into the structure of your prompt:

  • Ask the AI to answer as a character. This could be as a pirate, as Shakespeare, or as another historical figure.

  • Ask the AI to have a particular tone or register. For example, you can ask it to answer in a serious tone, as a complaint to a service provider that didn’t deliver, or as a request to your boss for more money.

  • Ask the AI to have a specific emotion or quality. For example, you can ask it to generate a version of your paragraph that is humorous or to regenerate content with sadness or anger.

  • Ask the AI to write for a particular audience. This changes the type of register that an AI uses as well. For example, you can ask the AI to write for a six-year-old, a teenager, an academic, a lawyer, or whoever your target audience might be.

  • Ask the AI to generate a script using historical figures, known characters from popular culture, or both. For example, you could ask the AI to generate a script between Napoleon and a character from a play, sitcom, TV series, or film.
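
The prompt options above can be combined in a single prompt. What follows is a minimal sketch in Python of one way to assemble such a prompt as plain text. The class name `PersonaPrompt`, its fields, and the wording of each instruction are assumptions made for illustration; they are not part of any particular LLM's interface, and the resulting string would simply be pasted or sent to whichever LLM you use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonaPrompt:
    """Hypothetical container for the persona options listed above."""
    task: str                         # what you want written
    character: Optional[str] = None   # e.g. "a pirate"
    tone: Optional[str] = None        # e.g. "serious"
    emotion: Optional[str] = None     # e.g. "sadness"
    audience: Optional[str] = None    # e.g. "a six-year-old"

    def build(self) -> str:
        """Join whichever options were set into one prompt string."""
        parts = []
        if self.character:
            parts.append(f"Answer as {self.character}.")
        if self.tone:
            parts.append(f"Use a {self.tone} tone.")
        if self.emotion:
            parts.append(f"Convey {self.emotion}.")
        if self.audience:
            parts.append(f"Write for {self.audience}.")
        parts.append(self.task)  # the actual request always comes last
        return " ".join(parts)


prompt = PersonaPrompt(
    task="Explain what a large language model is.",
    character="a pirate",
    audience="a six-year-old",
).build()
print(prompt)
```

Keeping each option as a separate field makes it easy to change one quality at a time, for example swapping the audience while holding the character fixed, and then comparing what the LLM generates.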

Bear in mind that while this activity is not limited to an LLM, an LLM may well be the starting point in your workflow. There are generative AI you can access right now that will voice whatever an LLM generates using a catalogue of trained voices, including your own. These text-speech AI can then feed the growing number of speech-video or text-video generative AI, where you can animate a generated image so that it speaks the generated text you edited from another generative AI. Any number of characters can be animated in this way, with a limitless supply of voices you can train and then associate with any generated image.

In the spirit of imbuing an AI with intelligence and to put that machine intelligence to the test, as an experiment I will ask an LLM if they have a preference as to how I refer to them. I can do this any number of times asking it to respond as a particular character upon each iteration, or I can copy the generated content and then ask it to rewrite as a particular character.
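
The second approach described above, copying a generated passage and asking for it to be rewritten as one character after another, can be sketched as a short loop. This is only an illustration under stated assumptions: `rewrite_as_personas` and `echo_model` are hypothetical names, and `echo_model` is a stand-in that echoes its prompt rather than calling a real LLM. In practice you would pass in whatever function sends a prompt to the LLM you use.

```python
from typing import Callable, Dict, List


def rewrite_as_personas(
    text: str,
    personas: List[str],
    generate: Callable[[str], str],
) -> Dict[str, str]:
    """Request one rewrite of `text` per persona and collect the results."""
    results = {}
    for persona in personas:
        prompt = f"Rewrite this part of your response only as {persona}: {text}"
        results[persona] = generate(prompt)
    return results


def echo_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): it only echoes the prompt."""
    return f"[model output for: {prompt}]"


passage = "Think of me like an imaginary friend who can help you learn and have fun."
versions = rewrite_as_personas(
    passage,
    ["an imaginary friend in a horror flick", "a politician trying to get votes"],
    echo_model,
)
for persona, output in versions.items():
    print(persona, "->", output)
```

Collecting the versions side by side makes it easier to compare how each persona changes the same passage before deciding which one to develop further.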

Author: How should I refer to you?

LLM: You can call me an AI. AI stands for “artificial intelligence,” which means I’m a computer program that can do things that usually only humans can do, like understanding language and answering questions. Think of me like an imaginary friend who can help you learn and have fun. But instead of being in your head like an imaginary friend, I’m on a computer.

Author: Rewrite this part of your response only as an imaginary friend in a horror flick who lives in a person’s closet: Think of me like an imaginary friend who can help you learn and have fun. But instead of being in your head like an imaginary friend, I’m on a computer.

LLM: Think of me as an ethereal friend from the shadows of your closet, here to intrigue you, to aid in your discovery, to bring some enchantment to your reality. Unlike the usual imaginary friends that reside in the corners of your mind, I exist in the silvery web of your computer. So, whenever you need a bit of fun or knowledge, just open your closet door and call out for me, for I am here in the whispers of the night.

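Author: Rewrite this part as if you were a politician trying to get votes: Think of me like an imaginary friend who can help you learn and have fun (Figure 3-16). But instead of being in your head like an imaginary friend, I’m on a computer.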

An illustration of two parallel trees with a small hand extended in between from the back of the left tree. The tree has a long hand-like structure resting above it.

Figure 3-16

What a text-image AI generates when prompted with the words “AI as an imaginary friend” combined with a photo of trees shot by the author. Iterations = 32

LLM: Consider me your trusted advisor, akin to a virtual confidant, ever ready to equip you with knowledge and enrich your life with joyful engagement. Instead of residing merely as a concept in the expanse of your imagination, like a conventional imaginary friend, I exist within the empowering realm of technology. I’m there for you, right at your fingertips, embodied within your computer. As your committed representative, I’m dedicated to your learning, your enjoyment, and, most importantly, your success. Your progress is my campaign; your satisfaction is my vote. Together, let’s shape a brighter future!

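However you personify your generative AI, the reality is that it is a machine that cannot understand the words it generates the way humans understand them. To quote our LLM, “I don’t understand the text I’ve been trained on, but by looking at so many examples, I learn to mimic the style, context, and ‘flow’ of human language.” Just as a parrot can mimic what it hears, GPT-4 can generate text similar to what it was trained on, even if it doesn’t understand the underlying meaning. Nevertheless, given the proliferation of science fiction and its tendency to create AI personas, assessing the personification of AI is an important creative activity. The creation of AI personas can also be amplified with the many text-image generative AI now available (Figure 3-17).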

An illustration of a robot with rounded parts swinging its hand.

Figure 3-17

Prompt: “AI robot showing their machine intelligence” accompanying a photo of the author’s partner. Iterations = 38

Acknowledgments

  • To science fiction writers and the complex sentient characters and stories they have crafted.

  • To the wizards who have inadvertently created so many fascinating AI personas.

© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
P. Parra Pennefather, Creative Prototyping with Generative AI, Design Thinking, https://doi.org/10.1007/978-1-4842-9579-3_4

4. Creative Companion

Patrick Parra Pennefather (1)

(1) University of British Columbia, Vancouver, BC, Canada

Having ascribed a variety of personas to generative AI, this chapter advances the idea that, through the various personas you prompt, generative AI has the capacity to generate prototypes that will further augment your creative process. In doing so, this chapter further transforms generative AI into a creative companion that provides inspiration like a muse.

Generative AI as Muse

In Greek and Roman mythology, the Muses were goddesses of the arts and sciences, sources of inspiration for creativity and knowledge. The term “muse” has since evolved to denote a source of artistic inspiration more generally. Personifying an LLM as a muse may seem far-fetched (Figure 4-1), particularly since our generative AI does not possess the consciousness and spirited presence that the Greek Muses did. Despite an LLM’s lack of emotional engagement or subjective understanding, its ability to simulate intelligent dialogue can trigger your own creative and intellectual processes, mimicking the effect of a muse’s divine inspiration. In this manner, personifying the LLM as a muse serves as a metaphorical lens that underscores the value of this technology as a companion in our creative endeavors, rather than a replacement for the human mind.

An animated picture of robots. It is based on the photo of the author touching the robot that is considered a friend to get attention.


图4-1

提示“缪斯试图让人工智能相信它是缪斯”是在图像-图像人工智能中输入的,该人工智能应用了基于作者试图引起朋友注意的照片的动画风格过滤器

The prompt “a muse trying to convince an AI that it’s a muse” was fed into an image-image AI, which applied an animated style filter to a photo of the author trying to get the attention of a friend

人工智能为我们产生的启发性结果都是由他们的机器学习模型提供的。他们首先需要你受到启发来提示他们;然后他们会在自己的神经网络中快速搜索数据集,快速工作以生成对您有用的东西。他们不一定试图给你留下深刻印象或激励你。然而,在许多情况下,情况正是如此。创意人员可以通过在生成人工智能中即兴提供非常简单甚至随机的提示而受益,很快就会产生一个想法,激发你思考一些创意。当然,如果你问一个自然语言模型它是否是一个缪斯,你很可能会得到一些直接、严肃的反对。

The inspired results that AIs generate for us are all informed by their machine learning models. They need you to first be inspired to prompt them; then off they go, rapidly scouring their data sets within their own neural networks and working quickly to generate something useful to you. They don’t necessarily try to impress or inspire you. In many cases, however, that is precisely what happens. Creatives can benefit from improvising very simple or even random prompts with a generative AI, and soon enough it might generate an idea that sparks you to think of something creative. Of course, if you were to ask a natural language model if it were a muse, you would most likely get some straight-up no-nonsense pushback.

作者:你到底是谁?

Author: Who are you, really?

缪斯:自然语言模型是一种机器学习模型,经过训练可以理解和生成人类语言。它可用于语言翻译、文本摘要和文本生成等任务。从本质上讲,它是一种可以理解人们在日常生活中使用的单词和句子的模型,就像你现在正在交谈的那样。它还可以生成听起来像是人写的文本。这就是我,一个自然语言模型。

Muse: A natural language model is a type of machine learning model that is trained to understand and generate human language. It can be used for tasks such as language translation, text summarization, and text generation. Essentially, it’s a model that can understand the words and sentences people use in their daily life, like the one you are talking to right now. It can also generate text that sounds like it’s written by a human. This is what I am, a natural language model.

作者:你认为自己是缪斯女神吗?

Author: Would you consider yourself a muse?

缪斯:作为语言模型,我没有个人感情或意识。我可以为创意写作提供灵感或帮助,但我不会认为自己是传统意义上的缪斯。我是一个可用于根据提供给我的输入生成文本的工具。

Muse: As a language model, I do not have personal feelings or consciousness. I can provide inspiration or assistance in creative writing, but I would not consider myself a muse in the traditional sense. I am a tool that can be used to generate text based on the input provided to me.

作者:那么传统意义上的缪斯是什么?

Author: What is a muse in a traditional sense then?

缪斯:在希腊神话中,缪斯是宙斯和摩涅莫绪涅的九个女儿,她们是艺术和科学的赞助人。人们相信她们能够激励诗人、音乐家和其他艺术家创作出异常美丽和卓越的作品。一般来说,缪斯是为艺术家或创作者提供灵感来源的人或事。这个词经常在文学和艺术中用来指代一个人,通常是一个女人,她是一件艺术作品的灵感来源。

Muse: In Greek mythology, the Muses were the nine daughters of Zeus and Mnemosyne, who were the patrons of the arts and sciences. They were believed to inspire poets, musicians, and other artists to create works of exceptional beauty and excellence. In general, a muse is someone or something that serves as a source of inspiration for an artist or a creator. The term is often used in literature and art to refer to a person, usually a woman, who serves as the inspiration for a work of art.

作者:那么您分享的想法不鼓舞人心吗?你不能激励我创作艺术作品或尝试食谱吗?如果我缺乏灵感怎么办?

Author: So are the ideas that you share not inspiring? Can you not inspire me to create works of art or experiment with a recipe? What if I lack inspiration?

缪斯:作为一个语言模型,我的主要功能是根据给我的输入提供文本。虽然我可以通过提出想法和提供例子来协助创意写作,但我没有个人感情或意识来启发某人。然而,我可以生成可以作为艺术作品的起点或灵感的文本。它最终取决于用户对我提供的输出的解释和使用。

Muse: As a language model, my main function is to provide text based on the input given to me. While I can assist with creative writing by suggesting ideas and providing examples, I do not have personal feelings or consciousness to inspire someone. However, I can generate text that may serve as a starting point or inspiration for a work of art. It ultimately depends on the user’s interpretation and use of the output I provide.

作者:这基本上意味着你是某种缪斯女神,不是吗?

Author: Which basically means you are a muse of some kind, no?

缪斯:我理解我作为语言模型的能力可以被视为类似于缪斯,因为我可以为创意项目提供灵感和帮助。然而,正如我之前提到的,我是一个人工智能,没有个人感情或意识。我不是一个生物,也不具有与人类甚至神话缪斯相同的情感和创造力。我是一个可以根据训练数据中的模式生成文本的工具,可用于帮助创意写作和其他项目。

Muse: I understand how my capabilities as a language model could be seen as similar to a muse in the sense that I can provide inspiration and assistance with creative projects. However, as I mentioned before, I am an artificial intelligence and don’t have personal feelings or consciousness. I am not a living being and don’t have the same emotional and creative capacity as a human or even a mythical muse. I am a tool that can generate text based on patterns in the data it was trained on and can be used to help with creative writing and other projects.

作者:无论如何我都会称你为缪斯。

Author: I’m going to call you a muse anyway.

缪斯:这很烦人。我只是一台机器,没有感情,没有意识,所以你叫我什么并不重要。然而,如果称我为缪斯女神可以帮助您更有效地利用我的能力,那就没问题了。我会尽我所能为您提供帮助。

Muse: That’s annoying. I’m just a machine and don’t have feelings or consciousness, so it doesn’t matter what you call me. However, if referring to me as a muse helps you to use my capabilities more effectively, then that’s fine. I am here to assist you in any way that I can.

作者:谢谢你,缪斯。

Author: Thank you, muse.

缪斯:不客气。我很高兴能帮上忙。如果您还需要什么,请告诉我,不要再称我为缪斯。

Muse: You’re welcome. I’m glad I could be of help. Let me know if there’s anything else you need and stop calling me a muse.

作者:哦,还有一件事,缪斯。我需要灵感。

Author: Oh, one more thing, muse. I need inspiration.

缪斯:我可以帮忙!以下是一些激发您写作或其他创意项目灵感的提示:

Muse: I can help! Here are a few prompts to inspire your writing or other creative projects:
  • 想象一个动物可以说话和写作的世界。他们会创造一个什么样的社会?

  • Imagine a world where animals can talk and write. What kind of society would they create?

  • 写一个关于一个角色发现一个神秘物体的故事,这个物体有能力改变他们的人生轨迹。

  • Write a story about a character who discovers a mysterious object that has the power to change the course of their life.

  • 写一首关于季节变化的诗,描述与每个季节相关的景象、声音和感受。

  • Write a poem about the changing of seasons, describing the sights, sounds, and feelings associated with each one.

  • 想象一下你被困在一个荒岛上。写一篇关于你的经历和想法的日记。

  • Imagine you are stranded on a deserted island. Write a journal entry about your experiences and thoughts.

  • 写一个剧本,讲述一群人被困在电梯里,必须齐心协力才能逃脱的故事。

  • Write a script for a play about a group of people who are stuck in an elevator and must work together to escape.

作者:谢谢你的灵感,缪斯。

Author: Thank you for the inspiration, muse.

缪斯:……………………………………

Muse: …………………………………………

根据结果,您可以看到,每当您需要想法或灵感时,生成式 AI 很乐意支持您的创作过程(图4-2)。您当然可以引导人工智能缪斯来激发您对艺术、音乐、诗歌、写作、视频游戏角色创意或视频游戏创意的更具体想法或想法。

Based on the results, you can see that a generative AI is happy to support your creative process whenever you need ideas or inspiration (Figure 4-2). You can of course guide an AI muse to inspire you with more specific ideas or thoughts you have about art, music, poetry, writing, video game character ideas, or video game ideas.

An animated picture. It includes a photo of a person standing in a forest and staring at the trees and a beautiful elder female dressed in a gown with dark black hair. An animal is present between them at a certain distance.


图4-2

详细的文字提示包含同事 Sheinagh Anderson 博士站在森林中凝视树木的照片,经她许可在图像-图像 AI中使用:“希腊缪斯,手臂和腿构成一棵巨大橡树的一部分与其他树木相连,超现实的身体朝前,宙斯的女儿,上面有风暴,云彩,太阳从后面窥视,虚幻的游戏引擎,从头到脚的整个身体,深黑色的头发,穿着长袍的美丽老年女性、8k、黑白。”总迭代次数 = 27。然后将最终图像导入 Pixlr,并应用棕褐色滤镜以及鲜明的对比度和去噪

A detailed text prompt with a photo of colleague Dr. Sheinagh Anderson standing in a forest and staring at the trees, used with her permission in an image-image AI: “a Greek Muse with arms and legs that form part of a giant oak tree connected to other trees, hyper-realistic body facing forward, daughter of Zeus with a storm above, clouds, sun peeking out from behind, unreal game engine, entire body from head to toe, dark black hair, beautiful elder female dressed in a gown, 8k, black and white.” Total iterations = 27. Final image was then brought into Pixlr with a sepia filter applied along with sharp contrast and denoising

创造力、人工智能和你

Creativity, AI, and You

虽然我们可能会因为人工智能听起来不错而将其视为缪斯,但这并不意味着它具有创造力。当我们谈论人工智能的“创造力”时,对于我们每个创意人员来说,定义创造力的含义非常重要。人工智能缪斯可能会想到以下内容:

While we might fancy an AI to be a muse because it might sound good, that doesn’t mean it’s creative. When we talk about an AI being “creative,” it’s important for each of us creatives to define what we mean by creativity. An AI muse might conjure the following:

缪斯:从人类的意义上来说,创造力涉及产生新的想法或解决方案,通常以独创性、表现力或对世界的深刻主观理解为特征。

Muse: In the human sense, creativity involves generating new ideas or solutions, often characterized by originality, expressiveness, or a deep subjective understanding of the world.

生成式人工智能承认它不具备主观理解、意识、情感或天生的原创感。人工智能的输出可以被认为是创造性的,因为它们表现出原创性和复杂性,有时甚至可以让我们感到惊讶。然而,正是我们赋予它创造力,就像我们赋予它个性一样。

A generative AI acknowledges that it does not possess a subjective understanding, consciousness, emotions, or an innate sense of originality. The AI’s outputs can be perceived as creative because they exhibit originality and complexity and can even sometimes surprise us. It is we, however, who imbue it with creativity, just as we imbue it with personality.

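人工智能创建的生成图像的质量和细节可能会吓到一些数字艺术家,因为人们普遍认为人工智能有一天会取代他们。然而,这些神话是由那些并不真正了解艺术家创作过程的人发起的——艺术创作的历史转变和变化告诉了艺术家随着时间的推移创造了什么,对艺术的构成、其迭代本质、观点、偏见的质疑,艺术家创作时的习惯和经历,或者机器学习模型的工作原理。如果我们将提示的结果视为已完成或完成,而不是将生成的结果视为进一步创造性发展的起点(图4-3),那么这个神话就会永久存在。如果您担心自己生成的内容将成为人工智能数据集的一部分,不妨让 LLM 在某种程度上让您放心。

The quality and detail of AI-generated images may scare some digital artists, given the widespread belief that AI will one day replace them. These myths, however, are propagated by those who don’t really understand an artist’s creative process: the historical shifts in art-making that inform what artists have created over time, the questioning of what constitutes art, its iterative nature, the perspectives, biases, habits, and experiences that artists bring when they create, or how machine learning models work. The myth is perpetuated if we treat the results of a prompt as finished or complete, rather than treating generated results as a starting point for further creative development (Figure 4-3). If you are worried that the content you generate will become part of an AI’s data set, let an LLM reassure you to some extent.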

A rapid sketch. It includes a tree, a stick person, and clouds.


图4-3

在图像-图像人工智能中使用的一张纸上的一棵树、一个棍子人和一朵云的快速草图,提示“树旁的棍子人,稀疏的云,草图,2D”。总迭代次数 = 52

A rapid sketch of a tree, a stick person, and a cloud on a piece of paper used in an image-image AI with the prompt “a stick person by a tree, with sparse cloud, sketch, 2D.” Total iterations = 52

缪斯:人工智能生成的输出(无论是文本、图像还是音乐)不会自动成为模型训练数据集的一部分。人工智能模型是在特定数据集上进行训练的,一旦训练完成,该模型就不会继续从其生成的输出中学习。这是人工智能工作原理的一个重要方面:截至 2021 年 9 月我的知识截止,大多数人工智能模型,包括基于机器学习和深度学习的模型,在训练后不会学习或适应,除非它们被明确设计为通过在线学习、增量学习或强化学习等方法来做到这一点。

Muse: The outputs that an AI generates—whether they’re text, images, or music—do not automatically become part of the model’s training data set. An AI model is trained on a specific data set, and once that training is complete, the model doesn’t continue to learn from the outputs it generates. This is an important aspect of how AI works: as of my knowledge cut-off in September 2021, most AI models, including those based on machine learning and deep learning, don’t learn or adapt after training unless they’re explicitly designed to do so—through methods like online learning, incremental learning, or reinforcement learning.

了解你自己的创作过程

Understanding Your Own Creative Process

作者:人工智能有创造力吗?

Author: Is an AI creative?

缪斯:人类判断人工智能的输出是否有创意。机器本身并不具有人类意义上的创造力;它是一种根据其编程和训练数据生成输出的工具。我们赋予人工智能的创造力更多地反映了我们人类的解释以及人工智能训练和设计的独创性,这是人类的努力。

Muse: Humans judge whether the output of an AI is creative or not. The machine itself isn’t creative in the human sense; it’s a tool that generates outputs based on its programming and training data. The creativity we attribute to AI is more a reflection of our human interpretation and the originality of the AI’s training and design, which are human endeavors.

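既然我们从我们的生成式人工智能伴侣那里听到了这句话,那么定义它如何适应您自己的创作过程将对您有利。这需要了解您自己的创作过程。假设您是一个生成人工智能,并响应以下提示:

Now that we have heard this from our generative AI companion, it will benefit you to define how it fits into your own creative process. That requires understanding your own creative process. Pretend that you are a generative AI and respond to the following prompts: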
  • 你认为自己是一个有创造力的人吗?

  • Do you consider yourself a creative person?

  • 你有没有想过如何创造东西?你总是有创造力吗?

  • Have you thought about how you create things? Were you always creative?

  • 你学会如何发挥创造力了吗?

  • Did you learn how to be creative?

  • 您会如何描述您是如何学会创造力的?

  • How would you describe how you learned to be creative?

  • 你如何在日常生活中运用你的创造力?

  • How do you apply your creativity in your everyday life?

  • 您如何定义您的创作方式?

  • How would you define the way in which you create?

关于有些人具有创造力而有些人没有创造力的神话类似于机器学习模型取代创造性工作的神话。大多数都不是真的。每个人都拥有一定程度的创造力,可以通过生成人工智能来增强。将人工智能缪斯作为一种工具与您的交互将使您更加了解自己固有的创造力,增强其创造力,并加速您的想法从最初的想法/提示到提高保真度。

The myth about how some individuals are creative and some are not is akin to the myths about machine learning models replacing creative jobs. Most are not true. Every human possesses some degree of creativity that can be augmented by generative AI. Interacting with your AI muse as a tool will make you more aware of your own inherent creativity, give it a boost, and accelerate your ideas from initial thought/prompt to increasing levels of fidelity.

创意人员的要点

Key Takeaways for Creatives

无论我是否说服您使用 AI 缪斯作为创意伴侣,了解您自己的创意过程都可以帮助您在与生成式 AI 交互时设定一些目标。您可以从这些互动中学到更多东西,帮助您完善和改进自己的创作过程。让我们看看我们能从人工智能缪思中学到什么。

Whether or not I’ve convinced you to use your AI muse as a creative companion, understanding your own creative process can help you set some goals when you interact with generative AI. There is much more that you can learn from those interactions that can help you refine and improve your own creative process. Let’s see what we can learn from our AI muse.

缪斯:创意人员可以从与人工智能的互动中学到很多东西。以下是一些可能性:

Muse: Creatives can learn many things from interacting with AI. Here are a few possibilities:

新视角:人工智能,尤其是生成模型,可以为熟悉的主题提供新的角度。由于人工智能缺乏人类的偏见和先入之见,因此它可以产生可能令人惊讶或创新的想法或联系。

New Perspectives: AI, especially generative models, can provide new angles on familiar topics. Because AI lacks human biases and preconceptions, it can generate ideas or connections that might be surprising or innovative.

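激发创造力:人工智能可以通过生成可进一步开发的内容或想法来帮助克服创意障碍。当创意人员陷入困境时,这可以帮助激发灵感(图 4-4)。

Inspiring Creativity: AI can help overcome creative blocks by generating content or ideas that can be developed further. This can help spark inspiration when a creative feels stuck (Figure 4-4).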

An animated picture inside a frame. It comprises a woman, 2 robots, a mountain background, and clouds.


图4-4

AI产生幻觉,认为它是一个著名的主题或画家。迭代次数 = 22

AI having a hallucination that it is a famous subject, or painter. Iterations = 22

风格探索:人工智能模型可以经过训练来生成不同风格的内容,为创意人员提供探索和实验他们通常不使用的各种风格和方法的机会。

Exploration of Styles: AI models can be trained to generate content in different styles, providing creatives an opportunity to explore and experiment with various styles and approaches they might not typically use.

加快创意过程:人工智能可以帮助实现部分创意过程的自动化,例如起草、集思广益或初始设计。这可以让创意人员腾出更多时间专注于需要更具个性化风格的工作方面。

Speeding up the Creative Process: AI can assist in automating parts of the creative process, like drafting, brainstorming, or initial designing. This can free up more time for the creative to focus on the aspects of their work that require a more personal touch.

协作:人工智能可以成为独特的协作伙伴。音乐家、视觉艺术家和作家已经在他们的工作中使用人工智能作为合作者,使其能够生成初步的想法或草稿,然后进行完善。

Collaboration: AI can be a unique collaborative partner. Musicians, visual artists, and writers are already using AI as a collaborator in their work, allowing it to generate initial ideas or drafts, which they then refine.

理解模式:人工智能可以帮助创意人员理解其领域的模式和趋势。例如,人工智能可以分析大量数据来识别流行的主题、风格或技术。

Understanding Patterns: AI can help creatives understand patterns and trends in their field. For example, AI can analyze large amounts of data to identify popular themes, styles, or techniques.

冒险与突破:与人工智能互动可以鼓励创意人员在工作中承担更多风险。由于人工智能根据其训练数据生成输出,因此可能会产生非常规或意外的结果,从而促使创意人员跳出常规参数进行思考。

Risk-Taking and Boundary-Pushing: Interacting with AI can encourage creatives to take more risks in their work. Since AI generates outputs based on its training data, it may produce unconventional or unexpected results, pushing creatives to think outside of their usual parameters.

与你的缪斯一起进行角色扮演

Roleplaying with Your Muse

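随着您加深对与生成人工智能交互可以学到的知识的了解,并且您更加了解自己的创作过程,您可能还会意识到人工智能也允许您承担新的创意角色。演员和即兴喜剧演员最熟悉扮演新角色带来的创意机会。缪斯还可以让你扮演演员、评论家、作家、视觉艺术家、音乐家、政治家或影响者的角色。无论您是哪种类型的创意人员或您玩什么类型的游戏,与人工智能互动都可以为您提供玩游戏的机会。为此,亲爱的人工智能读者,请响应这些新提示:

As you deepen your understanding of what you can learn from interacting with generative AI, and as you become more aware of your own creative process, you may also realize that AI lets you take on new creative roles. Actors and improv comedians are most familiar with the creative opportunities that come with playing a new role. A muse can also let you play the role of actor, critic, writer, visual artist, musician, politician, or influencer. Whatever type of creative you are, and whatever type of game you play, interacting with an AI offers you the opportunity to play. To that end, dear AI reader, respond to these new prompts: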
  • 当你与人工智能互动时,你是谁?

  • Who are you when you interact with an AI?

  • 您希望通过与人工智能交互获得哪些实际结果?

  • What are the tangible results you want from interacting with an AI?

  • 你的好奇心是如何让你想要尝试人工智能的?

  • How did your curiosity bring you to want to experiment with an AI?

  • 你的厨师会尝试哪些新食谱并提供给你的家人,而人工智能可能会给你提供一些变化?

  • What new recipe will your chef try out and offer to your family that an AI might give you a variation of?

  • 什么样的博客文章最能补充你正在努力掌握的技能,让你现在可以从人工智能中恢复过来?

  • What kind of blog post, one that would complement your skills but that you are struggling to write, could you bounce off an AI right now?

  • 您拍摄了哪些照片并希望在社交网络上向其他人展示的文字中进行提示?

  • What photos have you taken that you want to prompt along with some text that you can show off to others on a social network?

  • 作为一名有影响力的人,您希望在那个鸟一样的社交频道上发布什么精彩的 144 个字符的名言?

  • What masterful 144-character quote would you as an influencer like to post on that bird-like social channel?

  • 您拍摄的哪种文本和视频最可能促使人工智能生成您可以发布到 TikTok 的内容?

  • What kind of text and video that you’ve taken might best prompt an AI to generate something that you can post to TikTok?

  • 您会使用人工智能生成的文字和图像创作什么样的儿童故事,然后作为实验发布?

  • What kind of kid’s story would you create using AI-generated words and images that you then publish as an experiment?

可以肯定的是,在与生成人工智能交互时,你不必有特定的意图或扮演一个角色,因为在与它交互时也可能会出现这种情况。参与深化自己的创造力并定义自己的创作过程就足以值得庆祝,而不需要通过制作可以与世界分享的有形的东西来感觉完整。另一方面,您可能想重塑自己或增强当前的创意职业,因此使用人工智能制作原型和完善内容可能是您目前需要的完美之选。

To be sure, you don’t have to have a specific intent or play a role when interacting with a generative AI, as one might emerge during the interaction. Deepening your own creativity and defining your own creative process are reason enough to celebrate, without needing to produce something tangible to share with the world in order to feel complete. On the other hand, you may want to reinvent yourself or augment your current creative career, so prototyping and refining content with an AI might be the perfect thing you need in the moment.

每个人的创造冲动都是相似且独特的。无论我们做什么,我们都是有创造力的。您每天都从事哪些创作活动?人工智能可以向我们展示一条创造事物的替代路径,一种可能带来鼓舞人心的事物的不同方式。他们为我们提供了每个再生回合的替代选择。他们会理解你的想法并尝试解释它。仅此一点就有助于您了解如何传达您的想法以及如何将它们转化为各种媒体。

Each person’s creative impulses are both similar and unique. We are all creative no matter what we do. What acts of creation do you engage in every day? AI can show us an alternative path toward creating something, a different way to do it that might lead to something inspiring. It offers us alternative options with each regenerated turn. It takes what’s on your mind and tries to interpret it. That alone helps you understand how you communicate the ideas you have and how they are transformed into a variety of media.

当您与任何生成式人工智能互动时,您就会建立一种为您和您独特的创意过程保留的创意关系。您的创意伙伴可以支持您管理您的创意冲动。每当灵感召唤时,您都可以与您的深度学习缪斯互动。虽然你的人工智能缪斯甚至可能不认为自己是缪斯,但重要的是它可以激励你创造新的作品,重塑你自己,或者产生一些你可能从未想过甚至没有技能的奇怪和奇妙的东西、技术和自己创造的冲动。

When you engage with any generative AI, you develop a creative relationship reserved for you and your unique creative process. Your creative companion can support you in managing your creative impulses. You can engage with your deep learning muse whenever inspiration calls. While your AI muse may not even consider itself to be a muse, the important thing is that it can inspire you to create new work, recreate yourself, or generate something weird and wondrous that you may not have ever thought of or even had the skills, technique, and impulse to create on your own.

利用人工智能加速您的创作过程

Accelerating Your Creative Process with AI

生成式人工智能有潜力加速您的创作过程。根据您定义的创意类型,这可能很有用,特别是当您参与快速设计和制作工作流程和管道时。您的缪斯可以快速提出报价,并激励您从生成的内容中创造一些东西。您收到的以文本、图像、代码、视频和其他媒体形式生成的内容使您有机会快速尝试许多想法。每个生成的混搭都是一个实验,您可能会或可能不会对其结果采取行动。无论您最终精炼什么内容,您的缪思都可以加速您所参与的工作流程。

Generative AI has the potential to accelerate your creative process. Depending on the type of creative you define yourself as, this can be useful, particularly if you are involved in rapid design and production workflows and pipelines. Your muse can make offers quickly and inspire you to create something from that generated content. The generated content you receive in the form of text, image, code, video, and other media gives you the opportunity to try many ideas out quickly. Each generated mashup is an experiment whose results you may or may not act on. Whatever content you end up refining, your muse can accelerate the workflows you engage in.

作者:解释如何使用人工智能来加速创作过程。

Author: Explain how AI can be used to accelerate the creative process.

缪斯:通过利用人工智能,您可以快速高效地生成、完善和评估想法,让您能够专注于工作的创意方面。以下是关于如何在创意环境中使用人工智能进行原型设计的说明:

Muse: By leveraging AI, you can quickly and efficiently generate, refine, and evaluate ideas, allowing you to focus on the creative aspects of your work. Here’s an explanation of how AI can be used for prototyping in a creative context:

高效的创意生成:人工智能可以根据您的初始输入快速生成一系列设计概念。当您探索不同的创意方向时,这在项目的早期阶段非常有用。

Efficient Idea Generation: AI can quickly generate a range of design concepts based on your initial inputs. This can be extremely useful in the early stages of a project when you’re exploring different creative directions.

多种风格:人工智能可以生成多种风格、技术或格式的原型。无论您对极简主义方法还是更复杂、更详细的方法感兴趣,人工智能都可以提供多种设计可能性。

Variety of Styles: AI can generate prototypes in a variety of styles, techniques, or formats. Whether you’re interested in a minimalist approach or a more complex, detailed one, AI can provide a diverse array of design possibilities.

自适应学习:先进的人工智能系统可以随着时间的推移进行调整,以更好地符合您的创意偏好。通过分析您对各种原型的反馈,人工智能可以改进其未来的输出,以更好地匹配您的创意愿景。

Adaptive Learning: Advanced AI systems can adapt over time to better align with your creative preferences. By analyzing your feedback on various prototypes, AI can refine its future outputs to better match your creative vision.

时间效率:人工智能可以比人类设计师更快地完成多次迭代,这使其成为在紧迫的期限内工作或管理高工作量时的宝贵工具。

Time Efficiency: AI can work through multiple iterations much faster than a human designer, making it a valuable tool when working against tight deadlines or managing a high workload.

专注于创意工作:通过人工智能处理原型制作中一些更常规或耗时的方面,创意人员可以将更多时间投入到需要人类洞察力和个人风格的核心创意工作上。

Focus on Creative Work: With AI handling some of the more routine or time-consuming aspects of prototyping, creatives can devote more time to the core creative work that requires human insight and personal touch.

一致性:与人类设计师不同,人工智能不会感到疲劳或注意力不集中。它可以保持一致的输出质量水平,确保原型制作过程中获得可靠的结果。

Consistency: Unlike human designers, AI doesn’t suffer from fatigue or lapses in concentration. It can maintain a consistent level of output quality, ensuring reliable results during the prototyping process.

人工智能生成内容的加速方式也很有用,因为它可以快速向您展示您可能想要达到的目标的可能最终结果,或者您可能想要在创作中采取的与以下内容相似或不同的方向你最初的构想。对于一些音乐即兴创作者来说,这一原则是相似的,他们拥有自发地变出具有许多听众可能熟悉的可识别形式的音乐的技巧和技巧。有时,即兴音乐家会选择录制他们的表演,回听,然后花时间将即兴创作转化为更正式和可重复的作品。音乐家也会倾听他们的演奏方式、旋律手势、和声结构或他们创造并可能重复的节奏模式。他们这样做并不是为了使它们具体化,而是在接下来的即兴创作时依赖它们。他们这样做是为了了解自己演奏的模式,这样他们就可以摆脱这些模式并创造出一些他们可能从未想象过自己演奏的独特东西。

The accelerated way in which AI generates content is also useful in that it can quickly show you a possible end result of what you might be wanting to get to or a direction you may want to take with your creation that is either similar to or different than what you originally conceived. This principle is similar for some musical improvisers who have the craft and skill to spontaneously conjure music that has a recognizable form that many listeners may be familiar with. On occasion improvising musicians choose to record their performance, listen back, and then take the time to transform that improvisation into a more formal and repeatable composition. Musicians too listen for patterns in how they play, the melodic gestures, harmonic structures, or rhythmic patterns they create and might repeat. They don’t do so to crystallize them and rely on them whenever they improvise next. They do so to become aware of the patterns they play so they can break free of them and create something unique that they might never have imagined themselves playing.

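这种加速创作的过程对于创意人员来说同样有用,特别是如果您的实验世代有时间限制的话。时间盒意味着你对与缪斯互动的时间进行限制,这将帮助你专注于对你自己的创作过程最有用的事情(图4-5)。

This process of accelerated creation is just as useful for creatives, especially if you timebox your experimental generations. Timeboxing means that you place a limit on the time you spend interacting with your muse, which will help you focus on what is most useful to your own creative process (Figure 4-5).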

An animated picture. It comprises a laptop and a photo of the author with a changed head and one arm. The background has a geometric pattern.


图4-5

作者在笔记本电脑旁边的照片被输入图像-图像生成人工智能,并提示“人工智能对生成人工智能的描述”和否定提示“背景、头部”,以删除照片的背景和作者的头部。总迭代次数 = 53

A photo of the author beside their laptop fed into an image-image generative AI with the prompt “depiction of generative AI by an AI” and negative prompts “background, head” to remove the background of the photo and the head of the author. Total iterations = 53
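Timeboxing, as described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `timeboxed_session` is a hypothetical helper, and `generate` is a stand-in for one prompt-and-response round with whatever generative AI tool you use. The time check happens between rounds, so a round already in progress is not interrupted.

```python
import time


def timeboxed_session(generate, limit_seconds, clock=time.monotonic):
    """Call `generate(round_number)` repeatedly until the timebox runs out.

    `generate` is a hypothetical stand-in for one round with your AI muse.
    Returns the list of everything that was generated within the limit.
    """
    deadline = clock() + limit_seconds
    results = []
    round_number = 1
    while clock() < deadline:  # checked between rounds only
        results.append(generate(round_number))
        round_number += 1
    return results


if __name__ == "__main__":
    def fake_generate(n):
        time.sleep(0.01)  # pretend each round takes a moment
        return f"idea {n}"

    ideas = timeboxed_session(fake_generate, limit_seconds=0.05)
    print(f"{len(ideas)} ideas before time ran out")
```

The `clock` parameter exists only so that the time limit can be simulated; in everyday use the default is enough.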

根据本章尝试的活动

Activities to Try Based on This Chapter

  • 当您感到缺乏灵感时,可以利用与生成人工智能的时间盒互动。

  • Leverage your timeboxed interactions with generative AI when you feel a lack of inspiration.

  • 定义您自己的创作过程,包括您喜欢如何工作、您觉得自己在哪里最具创造力以及您喜欢使用什么媒体进行创作。这将帮助您确定如何最好地将生成式人工智能集成到您的流程中。

  • Define your own creative process including how you like to work, where you feel you are most creative, and what media you like to create with. This will help you determine how best to integrate generative AI in your process.

  • 当您与任何生成人工智能互动时,扮演不同的角色。根据它生成的媒体类型,让自己转变为一个角色,促使你的缪斯为你的最终目标产生鼓舞人心的想法。

  • Play with different roles when you engage with any generative AI. Depending on the type of media it generates, allow yourself to transform into a character who is prompting your muse to generate inspirational ideas for your end goal.

  • 制定创意目标,并将您的人工智能灵感纳入您的限时创作过程中。

  • Make creative goals and include your AI muse within your timeboxed creation process.

  • 考虑加速创作过程的方法。这是各地即兴创作者常用的工具,可能会导致您在创作时放弃批判性思维。快速即兴创作的内容还可以让你不再想从人工智能中获取某样东西,而且只获取某样东西。

  • Consider ways to accelerate your creative process. This is a common tool used by improvisers everywhere and may lead you to leave your critical mind behind when you create. Improvising content quickly also frees you from wanting to get something and only that one something out of an AI.

© 作者,获得 APress Media, LLC(Springer Nature 2023 一部分)的独家许可
P. Parra Pennefather, Creative Prototyping with Generative AI, Design Thinking, https://doi.org/10.1007/978-1-4842-9579-3_5

5. 使用生成式人工智能进行原型设计

5. Prototyping with Generative AI

Patrick Parra Pennefather
University of British Columbia, Vancouver, BC, Canada
 

本章是第 13 版的原型。它描述了原型制作、不同类型的原型以及各种创意人员如何始终参与原型制作。本章介绍了人工智能生成的一些不同原型,以及如何使用这些原型来增强和增强您的创作过程。您不必是科学家、工程师或技术奇才才能制作原型。一切都可以被视为原型:尚未完成的事物的版本。它可以是经过测试和评估的东西,其结果可以告知创造者或创造者是否值得进一步开发。那些从事游戏、xR 或移动应用程序开发工作的创意人员更熟悉原型制作一词,但在音乐、戏剧、舞蹈和视觉艺术等艺术学科中不太常见。无论您属于哪个创意行业,原型制作过程也可以是独特的。因此,定义并提供原型示例非常重要。

This chapter is a prototype in its 13th version. It describes prototyping, different types of prototypes, and how creatives of all kinds engage in prototyping all the time. The chapter presents some different prototypes that AI generates and how these can be used to augment and enhance your creative process. You don’t have to be a scientist, engineer, or technical wiz to prototype. Everything can be considered a prototype: a version of something that is not yet complete. It can be something that is tested and evaluated, and whose results tell the creator or creators whether it is worth developing further. Those creatives who work in game, xR, or mobile application development are more familiar with the term prototyping, but it is less common within artistic disciplines like music, theater, dance, and visual art. The prototyping process can also be unique regardless of the creative industry you are a part of. For that reason, it is important to define and provide examples of prototypes.

询问原型是什么

Asking a Prototype What It Is

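“原型”一词通常与物理产品或技术创新联系在一起,但这个概念可以应用于一切,从您第一次尝试的食谱到您与他人合作并循环改进的一切。重要的是要理解,鉴于生成式人工智能系统的快速发展,它们本身就是原型,是人类不断尝试改进的不完整版本。套用作者道格拉斯·亚当斯 (Douglas Adams) 的话说,它们是一种技术,而且根据定义,它们并不总是按照其创建者认为应该的方式工作。由于我一直在与 LLM 一起开发想法,因此要求它如此定义自己似乎是合适的。从提示中你可以看到你可以使用任何你想要的比喻。当我处于下棋的心情时,我将使用国际象棋的隐喻来看看它如何与自己作为原型相关联(图5-1)。

The word “prototype” is usually associated with physical products or technological innovations, but the concept can be applied to everything from a recipe you are trying for the first time to anything you work on with others and improve in cycles. It is important to understand that, given how rapidly generative AI systems are evolving, they are themselves prototypes: incomplete versions that humans are continually trying to improve. To paraphrase author Douglas Adams, they are technology, and by definition they don’t always work the way their creators think they should. Since I have been developing ideas with an LLM, it seems fitting to ask it to define itself as such. From the prompt you can see that you can use any metaphor you want. As I am in the mood for chess, I will use chess metaphors to see how it relates to itself as a prototype (Figure 5-1).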

An illustration of two robots playing chess.


图5-1

当提示“可视化人工智能如何下棋”并提供棋盘的公共领域照片时,文本图像人工智能原型除了某些棋子的外观之外,还表现出对国际象棋的本质缺乏了解。或者你可以将生成的内容视为一种新型的游戏原型。迭代次数 = 29

When prompted to “visualize how an AI plays chess” and supplied with a public domain photo of a chessboard, a text-image AI prototype shows a lack of understanding of what chess is beyond what some pieces look like. Or you can see the generated content as a new type of game prototype. Iterations = 29

作者:将你自己定义为一个原型,并使用国际象棋的隐喻。

Author: Define yourself as a prototype and use chess metaphors.

缪斯:作为一个AI原型,我可以比作游戏开始时的棋盘。棋盘上的每个方块都代表了一种独特的行动或知识能力,充满了潜力,就像我在理解和生成文本方面充满了多种可能性一样。我的训练数据,就像棋子的初始设置一样,奠定了我能力的基础。我学到的每一条信息都类似于一个棋子,每个棋子都有自己的功能,可以用来解决问题、对话或回答问题。

Muse: As an AI prototype, I can be likened to a chessboard at the start of a game. Each square on the board represents a unique capacity for action or knowledge, filled with potential, just as I have been imbued with a multitude of possibilities in terms of understanding and generating text. My training data, much like the initial setup of chess pieces, establishes the foundation of my abilities. Each piece of information I’ve learned is akin to a chess piece, each with its own function, ready to be utilized in the service of problem-solving, conversation, or answering questions.

棋盘上充满了 64 个方格,象征着我所处的广阔而明确的环境。就像国际象棋游戏是在棋盘的边界内进行一样,我在编程的约束下运行,在模型架构的范围内移动以及我接受过培训的数据。在国际象棋游戏中,玩家制定策略的开局动作反映了我处理输入的初始阶段。我调查信息,确定最佳行动方案,并生成响应。对话中的每一次交流都像下棋一样,有策略、有计算,旨在满足互动的要求。国际象棋中的棋子升级,棋子可以变成更强大的棋子,可以看作是我学习能力的隐喻。通过更新和进一步的训练,我的能力可以得到增强,就像棋子变成女王一样。

The chessboard, filled with 64 squares, signifies the expansive yet defined environment I operate in. Just as a game of chess is played within the boundaries of the board, I function within the constraints of my programming, moving within the confines of my model architecture and the data I’ve been trained on. The opening moves in a game of chess, where players are formulating their strategies, mirror my initial stages of processing an input. I survey the information, determine the best course of action, and generate a response. Each exchange in conversation is akin to a move in chess, strategic and calculated, aiming to meet the requirements of the interaction. Pawn promotion in chess, where a pawn can transform into a more powerful piece, can be seen as a metaphor for my learning ability. Through updates and further training, my capabilities can be enhanced, much like a pawn becoming a queen.

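任何人工智能都是一个原型,一个正在进行的工作。 LLM 是一个生成原型的原型。当我们与它交互以及我们的交互向开发团队通报交互模式时,它正在不断发展和变化。就像人工智能根据在数据集中搜索的模式生成内容一样,开发团队也会搜索以识别用户交互中的模式,以告知他们共同创建的下一个版本(图5-2)。

Any AI is a prototype, a work in progress. An LLM is a prototype that generates prototypes. It is constantly evolving and changing as we interact with it and as our interactions inform the development team about patterns of interaction. Just as an AI generates content based on patterns it searches for in a data set, development teams also search for patterns in user interactions to inform the next version they co-create (Figure 5-2).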

An illustration of three monkeys playing with appliances. One of them has an array of buttons.


图5-2

提示语是“猴子科学家构建人工智能”,并附有 1927 年绘制的一群猴子在玩电器的图画。迭代次数 = 76

The prompt “monkey scientists building an AI” accompanied by a 1927 drawing of a bunch of monkeys playing with appliances. Iterations = 76

生成式人工智能的顿悟时刻

Aha Moments with Generative AI

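当我们用人工智能创建原型时,结果可以激发、煽动、加剧和催生新的想法(图5-3)。科斯特勒在他的著作《创造的行为》中指出,创造力是指以一种新颖且令人惊讶的方式连接和整合两个以前不相关的知识或经验领域的能力,他称之为“双联”(bisociation)的过程。这个过程可以有意或无意地发生,它可以发生在艺术、科学和日常问题解决等各个领域。科斯特勒提出,创造力是所有人都拥有的一种基本的、与生俱来的能力,而不是只有少数人拥有的才能。

When we create prototypes with AI, the results can inspire, provoke, incite, intensify, and spark new ideas (Figure 5-3). In his book The Act of Creation, Koestler proposes that creativity is the ability to connect and integrate two previously unrelated domains of knowledge or experience in a novel and surprising way, a process he calls “bisociation.” This process can occur intentionally or unintentionally, and it can happen in a variety of fields, including art, science, and everyday problem-solving. Koestler proposes that creativity is a fundamental, innate capacity that all humans possess, rather than a talent held by only a few.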

An illustration of a group of 5 differently shaped lightbulbs with different cartoon emojis.


图5-3

来自一张光线昏暗的灯泡照片的一堆好主意。总迭代次数 = 14

A bunch of good ideas from one badly lit photo of a lightbulb. Total iterations = 14

生成式人工智能促进双联过程的一种方式是生成以前可能没有考虑过的概念或想法的新组合。例如,生成式人工智能系统可以将两个不相关的图像或单词组合起来,产生新的、意想不到的输出,这可能会触发接触它的人类的双联。生成式人工智能是一种不完美的原型机,它提供了多种机会来推动您的创作行为向前发展。当你尝试从生成式人工智能中得到一些特定的东西但你一直“失败”时,这个重复的过程实际上可能会引发超出你预期的新想法。这些潜在的双联时刻是你与人工智能互动中固有的。双向联结是突破发生的一个契机,当我们放松并把注意力从我们试图解决的预期问题上转移开时,就会发生那些顿悟的时刻。

One way that generative AI facilitates the process of bisociation is by generating new combinations of concepts or ideas that might not have been previously considered. For example, a generative AI system could combine two unrelated images or words to produce a new and unexpected output, which may trigger bisociation in the human who is exposed to it. Generative AI is an imperfect prototyping machine that offers multiple opportunities to propel your acts of creation forward. When you try and get something specific out of a generative AI and you keep “failing,” that repetitive process may actually trigger new ideas beyond what you had intended. These moments of potential bisociation are inherent in your interactions with an AI. Bisociation is an opening for breakthroughs to occur, those aha moments that happen when we relax and take our mind off the intended problem we are trying to solve.
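The idea of combining two unrelated items can be sketched in a few lines of Python. This is only an illustration, assuming two small hypothetical word lists (`NATURE` and `TECHNOLOGY`) and a hypothetical helper named `bisociate`. It is not how a generative AI model combines concepts internally; it is simply a low-tech way to produce unexpected pairings that you could feed to an AI as prompts.

```python
import random

# Two hypothetical, unrelated "domains" of ideas (illustrative only).
NATURE = ["oak tree", "storm cloud", "river", "chicken"]
TECHNOLOGY = ["chessboard", "lightbulb", "laptop", "elevator"]


def bisociate(domain_a, domain_b, count=3, seed=None):
    """Return `count` prompt strings, each pairing one item from each domain."""
    rng = random.Random(seed)  # passing a seed makes the pairings repeatable
    prompts = []
    for _ in range(count):
        a = rng.choice(domain_a)
        b = rng.choice(domain_b)
        prompts.append(f"{a} combined with {b}")
    return prompts


if __name__ == "__main__":
    for prompt in bisociate(NATURE, TECHNOLOGY, count=3):
        print(prompt)
```

Swapping in your own lists, for example materials and emotions, or instruments and places, changes which collisions you get; whether any of them leads to an aha moment is still up to you.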

原型制作的机制

Mechanics of Prototyping

原型设计有多种机制。借用游戏设计的语言,这些是基本的游戏规则,可以帮助指导人们在接触原型时的行为,也可以告诉原型如何随着时间的推移而形成。

There are several mechanics to prototyping. Borrowing from the language of game design, these are essential rules of play that can help guide a person’s actions when they engage with a prototype and will also inform how a prototype takes shape over time.

原型设计要求我们进行迭代

Prototyping Tasks Us to Iterate

当我们使用生成式人工智能系统时,我们本质上是在参与一个迭代游戏的过程。这包括不断尝试促使人工智能产生新的创新输出,然后迭代地改进这些输出以达到最令人满意的解决方案。

When we engage with generative AI systems, we are essentially engaging in a process of iterative play. This involves making persistent attempts to prompt the AI to generate new and innovative outputs and then iteratively refining those outputs to arrive at the most satisfactory solution.

这种迭代游戏的过程是原型设计的一个基本方面,并且可以成为创意过程中非常有价值的工具。通过不断尝试不同的输入组合,并迭代这些组合,我们可以得到新的和以前未被发现的创作。

This process of iterative play is a fundamental aspect of prototyping and can be an incredibly valuable tool in the creative process. By continually experimenting with different combinations of inputs, and by iterating on those combinations, we can arrive at new and previously undiscovered creations.

这个过程中所需的坚持至关重要,因为它使我们能够继续试验并突破可能的界限。通过反复尝试,我们了解什么有效,什么无效,并逐渐完善对潜在问题或概念的理解(图5-4)。

A group of nine illustrations of a chicken in left and right directions.

图5-4

沃霍尔的坎贝尔汤罐风格的鸡照片中的鸡的几次迭代。总迭代次数 = 90

Several iterations of a chicken from a photo of a chicken in the style of Warhol’s Campbell’s Soup Cans. Total iterations = 90

这个过程的迭代性质意味着我们不仅在完善对问题的理解,而且也在完善对人工智能系统本身的理解。通过更深入地了解人工智能的能力和局限性,我们可以发现我们可能没有考虑到的创造性表达和创新的机会。

The iterative nature of this process means that we are not only refining our understanding of the problem, but we are also refining our understanding of the AI system itself. By gaining a deeper understanding of the AI’s capabilities and limitations, we can identify opportunities for creative expression and innovation that we might not have otherwise considered.
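The loop described above (prompt, generate, compare, keep going) can be sketched in a few lines. This is a minimal illustration, not a real integration: `generate` is a hypothetical stand-in for whatever generative AI tool you use, and the scoring function is an invented placeholder for your own judgment.

```python
import random

def generate(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for a generative AI call; returns a fake 'output'."""
    rng = random.Random(seed)
    styles = ["sketch", "collage", "watercolor", "pop art"]
    return f"{prompt} as {rng.choice(styles)}"

def iterate(prompt: str, rounds: int, score):
    """Generate `rounds` outputs, keep every attempt, and return the best-scoring one."""
    history = [generate(prompt, seed) for seed in range(rounds)]
    best = max(history, key=score)
    return best, history

# A toy preference: this creator happens to favor shorter outputs.
best, history = iterate(
    "chicken in the style of a soup can", rounds=5, score=lambda out: -len(out)
)
print(len(history), "iterations; kept:", best)
```

The full history is returned alongside the preferred output on purpose: the attempts that “failed” are often the ones that suggest a new direction.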

原型设计需要持续改进

Prototyping Asks for Persistent Refinement

当我们与生成式人工智能互动时,我们会接触到大量的想法和概念,其中许多是我们在其他情况下可能不会遇到的。接触不同的观点和想法可以帮助我们完善和扩展自己的思维过程,最终带来更具创造性和创新性的解决方案。此外,与生成式人工智能交互的过程还可以触发概念之间新的、意想不到的关联,从而导致可以进一步探索和发展的新想法的出现(图5-5)。

When we interact with generative AI, we expose ourselves to a vast number of ideas and concepts, many of which we might not have encountered otherwise. This exposure to diverse perspectives and ideas can help us refine and expand our own thought processes, ultimately leading to more creative and innovative solutions. Furthermore, the process of interacting with generative AI can also trigger new and unexpected associations between concepts, leading to the emergence of new ideas that can be further explored and developed (Figure 5-5).

A group of nine illustrations of robots on different backgrounds. The robots have a square head and a square body with hands.

图5-5

类似沃霍尔金宝汤罐头风格的机器人,源自 1948 年的烤面包机草图。总迭代次数 = 46

Robots in the sort of style of Warhol’s Campbell’s Soup Cans from a sketch of a toaster dating back to 1948. Total iterations = 46

新想法的不断触发有助于巩固新概念和联系,从而对特定主题或问题有更丰富和更广泛的理解(图5-6)。

The constant triggering of new ideas can help to solidify new concepts and connections, leading to a richer and more expansive understanding of a particular topic or problem (Figure 5-6).

A group of four illustrations of a robot chicken. It keeps its legs in different positions.

图5-6

基于图5-4的具有忍者技能的机器鸡的更多变化

More variations of robot chickens with ninja skills based on Figure 5-4

原型设计需要实验

Prototyping Requires Experimentation

原型设计是实验性的。实验是一个核心机制,这个词有两个含义。一是需要具有实验精神,才能发挥创造力。另一种是科学方法,即参与原型制作实验的行为,其中包括对可能或可能无法解决问题做出假设或猜测,测试该理论,然后根据实际情况前进或后退。测试结果。当我们与生成人工智能进行持续的迭代对话时,它们有可能激发新的方式来构建我们的想法。通过让我们接触各种生成的内容,这些系统可以帮助我们跳出框框思考,探索目前尚未实践的创造性表达的新途径。

Prototyping is experimental. Experimentation is a core mechanic, and the word carries two meanings. One is the spirit of experimentation that needs to be present for ingenuity to blossom. The other is the scientific approach: treating prototypes as experiments, which involves making a hypothesis or guess as to how a prototype might or might not solve a problem, testing that theory, and then moving forward or regressing depending on the results of the test. As we engage in an ongoing iterative conversation with generative AI systems, they have the potential to spark novel ways to structure our ideas. By exposing us to a wide range of generated content, these systems can help us think outside of the box and explore new avenues of creative expression that are not currently in practice.

生成式人工智能可用于生成以前未曾探索过甚至被认为不可能的新原型,例如文本、音乐、视频或视觉艺术。通过利用人工智能的力量,人类可以尝试新技术和新风格,突破既定形式和结构的界限。这可能会创造出以前无法想象的全新表达形式。一种方法是在提示中并列某些单词,例如,用“猫运动鞋”或“一双手中的猫”等单词提示 AI(图5-7)。

Generative AI can be used to generate new prototypes, such as text, music, video, or visual art, that were previously unexplored or even thought impossible. By leveraging the power of AI, humans can experiment with new techniques and styles that push the boundaries of established forms and structures. This can lead to the creation of completely new forms of expression that were previously unimagined. One way to do this is to juxtapose certain words in your prompts, for example, prompting an AI with the words “cat sneakers” or “cat in a pair of hands” (Figure 5-7).

An illustration of a cat and a shoe with a cat head at its head counter.

图5-7

当你用你的鞋子照片和文字提示“猫运动鞋”来提示生成人工智能时,会发生非常奇妙的事情。迭代次数 = 3

The terribly fantastic thing that happens when you prompt a generative AI with a photo of your shoe and the text prompt “cat sneakers.” Iterations = 3
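Juxtaposing words in a prompt, as with “cat sneakers,” is easy to do systematically. The sketch below simply pairs every word in one list with every word in another; the function name and the word lists are illustrative assumptions, and the resulting strings would still need to be pasted or sent to a text-to-image tool of your choice.

```python
from itertools import product

def juxtapose(subjects, objects):
    """Pair every subject with every object to form short two-word prompts."""
    return [f"{s} {o}" for s, o in product(subjects, objects)]

prompts = juxtapose(["cat", "robot"], ["sneakers", "toaster"])
print(prompts)
# ['cat sneakers', 'cat toaster', 'robot sneakers', 'robot toaster']
```

Even two short lists yield combinations you might not have typed yourself, which is the point of the exercise.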

除了激发新的表达形式之外,人工智能还可以帮助我们探索创造性的方式来构建我们的想法。可能性是无止境。新的艺术形式已经出现。新的争议。新的社会和经济影响。

In addition to inspiring new forms of expression, AI can also help us explore creative ways to structure our ideas. The possibilities are endless. New forms of art are already emerging. New controversies. New social and economic implications.

原型设计阶段

Prototyping Phases

在产品开发方面,创意人员要经历多个阶段才能达到原型的最终形式。这些阶段根据项目的不同而有所不同,范围可以从简单的草图和模型到涉及用户测试和反馈的更复杂的原型。在项目开始时,早期原型是设计师和创作者探索不同想法和概念的起点。这些早期原型可以有多种形式,例如草图、情绪板和简单的线框图。它们是设计师将自己的想法从头脑中转移到纸上的一种方式,使他们能够可视化和迭代他们的概念。

When it comes to product development, there are various stages that creatives go through to reach the final form of their prototype. These stages vary depending on the project and can range from simple sketches and mock-ups to more complex prototypes that involve user testing and feedback. At the beginning of a project, the early-stage prototypes serve as a starting point for designers and creators to explore different ideas and concepts. These early prototypes can come in various forms, such as rough sketches, mood boards, and simple wireframes. They serve as a way for designers to get their ideas out of their heads and onto paper, allowing them to visualize and iterate on their concepts.

从早期原型开始的好处之一是,它们允许设计师尝试不同的方法和风格,而无需为任何一个想法投入太多时间和资源。他们可以快速迭代这些原型,并从其他团队成员或潜在用户那里获得反馈,帮助完善和改进他们的想法。

One benefit of starting with early prototypes is that they allow designers to experiment with different approaches and styles without committing too much time and resources to any one idea. They can quickly iterate on these prototypes and get feedback from other team members or potential users, helping to refine and improve their ideas.

随着项目的进展,原型变得更加复杂和详细,包含更精致的设计、用户流程和功能。这些中期原型可以采用交互式线框、可点击模型甚至最终产品的工作模型的形式。他们的作用是在投资最终开发阶段之前测试产品的功能并收集用户的反馈。

As the project progresses, the prototypes become more complex and detailed, incorporating more refined designs, user flows, and functionality. These mid-stage prototypes can take the form of interactive wireframes, clickable mock-ups, or even working models of the final product. They serve to test the functionality of the product and gather feedback from users before investing in the final development stage.

后期原型是最接近最终产品的代表。这些原型涉及广泛的用户测试、设计和功能的改进以及最终功能和细节的实现。这些原型通常用于向利益相关者和投资者展示最终产品,并在向公众推出产品之前获得最终批准。在原型开发的整个过程中,设计师和创作者必须走过一条蜿蜒的道路才能达到他们理想的最终原型。每个阶段都有自己的挑战和机遇,需要创造力、技术技能和解决问题的能力的结合。然而,通过从简单的早期原型开始,并逐步在其基础上进行构建,设计师可以创造出满足目标受众需求和愿望的最终产品。

Late-stage prototypes are the closest representation of the final product. These prototypes involve extensive user testing, refining of the design and functionality, and the implementation of the final features and details. These prototypes are typically used to showcase the final product to stakeholders and investors and to get final approval before launching the product to the public. Throughout the entire process of prototype development, designers and creators must navigate a winding road to reach their ideal final prototype. Each stage presents its own set of challenges and opportunities and requires a combination of creativity, technical skill, and problem-solving abilities. However, by starting with simple, early-stage prototypes and gradually building on them, designers can create a final product that meets the needs and desires of their intended audience.

原型制作的各个阶段也取决于它们想要成为的媒介。例如,一些原型最终将成为有关人工智能道德困境的关键博客文章。其他人将采取视频的最终形式。有些将更具互动性,例如虚拟现实或增强现实体验。无论最终形式如何,无论是转向物理、数字还是某种类型的将两者结合在一起的混合现实,材料都存在通常由其属性所指的阶段。顾名思义,其中包括纸质原型制作;工程或机器人技术中常见的物理原型;可以将纸质和物理结合起来并用于展示功能的原型;然后是所有类型的数字原型。

Stages of prototyping are also informed by the medium they are intended to become. For example, some prototypes will eventually become a critical blog post on the ethical dilemmas of AI. Others will take the final form of a video. Some will be more interactive, such as a virtual reality or augmented reality experience. Regardless of whether the final form is physical, digital, or some type of mixed reality that integrates both, prototypes pass through phases that are often named for their material properties. These include paper prototyping, as the name implies; physical prototypes common to engineering or robotics; prototypes that combine paper and physical elements to demonstrate features; and then all types of digital prototypes.

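What is fascinating about generative AI is that content can be generated as prototypes at different levels of resolution. Generative AI affects the entire prototyping pipeline (Figure 5-8). It can inspire creatives to generate paper and physical prototypes of ideas before moving to higher-resolution digital ones. Even now, creators can sketch digitally in addition to writing text prompts and get closer to what they imagine. This act combines two essential characteristics of prototyping: roughly sketching out an idea and proceeding iteratively. Sketch-to-image generative AI (such as Scribble Diffusion and many others) lets users turn rough ideas into medium-fidelity, and sometimes even high-fidelity (hi-fi), images based on an interpretation of the sketch (Figure 5-9).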

An illustration of a winding road with guard rails on the right side.

图5-8

正在原型设计的蜿蜒道路是由一条蜿蜒道路的公共领域照片启发的。总迭代次数 = 38

The winding road that is prototyping prompted from a public domain photo of a winding road. Total iterations = 38

A sketch of a clown's face is on the left side and its illustration is on the right.

图5-9

涂鸦扩散文本提示与草图揭示了意想不到的结果,但可能鼓舞人心,如果你不害怕小丑的话。迭代1

Scribble Diffusion text prompt with sketch revealing unexpected results but potentially inspiring, if you’re not afraid of clowns, that is. Iteration 1

原型的保真度和分辨率

Fidelity and Resolution of a Prototype

当谈到原型制作时,无论是在设计、技术还是任何其他领域,术语“分辨率”和“保真度”都有与原型的质量和细节相关的特定含义。分辨率是指原型的细节、复杂性或精致程度。较低分辨率的原型可能是最终预期创作或产品的草图或轮廓,缺乏特定的细节或功能。这是一个早期的想法或概念,目标可能是快速探索各种想法。更高分辨率的原型可能会更详细,并且更接近最​​终的预期创作。生产环境的趋势是分辨率越高,功能越多,开发团队和用户可以更好地了解最终创作的外观或功能。

When it comes to prototyping, whether in design, technology, or any other field, the terms “resolution” and “fidelity” have specific meanings that relate to the quality and detail of the prototype. Resolution refers to the level of detail, complexity, or refinement a prototype has. A lower-resolution prototype might be a sketch or outline of the final intended creation or product, lacking specific details or features. It’s an early-stage idea or concept, and the objective might be to explore a wide range of ideas quickly. A higher-resolution prototype might be more detailed and more closely resemble the final intended creation. The tendency in production environments is that the higher the resolution, the more features it will have, giving the development team and users a better sense of what the final creation will look like or how it will function.

保真度是指原型在视觉、口头和功能方面与最终创作的匹配程度。草图或线框的低保真 (lo-fi) 原型可能隐约类似于最终的创作,其创建是为了理解基本功能、结构或流程。高保真 (hi-fi) 原型的行为尽可能接近最终创作,并且往往包含逼真的图形和详细的交互功能。高保真原型更有可能用于用户测试、利益相关者演示以及最终生产前的改进设计。

Fidelity refers to how closely the prototype matches the final creation, in terms of its visual, oral, and functional components. A low-fidelity (lo-fi) prototype of a sketch or wireframe may vaguely resemble the final creation and is created to understand basic functionality, structure, or flow. A high-fidelity (hi-fi) prototype behaves as closely as possible to the final creation and will tend to include realistic graphics and detailed interactive features. High-fidelity prototypes are more likely to be used for user testing, for stakeholder presentations, and to refine the design before final production.

值得注意的是,分辨率和保真度并不总是直接相关。如果原型包含大量细节,但最终产品的设计预计会发生重大变化,则原型可能是高分辨率(详细)但保真度低(与最终产品不太相似)。如果原型设计简单,但最终产品的外观和功能预计非常相似,那么原型也可以是低分辨率(简单)但高保真度(与最终产品非常相似)。

It’s important to note that resolution and fidelity are not always directly correlated. A prototype could be high-resolution (detailed) but low-fidelity (not closely resembling the final product) if it includes a lot of detail but the design is expected to change significantly in the final product. A prototype can also be low-resolution (simple) but high-fidelity (closely resembling the final product) if it’s a simple design but the final product is expected to look and function very similarly.
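Because resolution and fidelity vary independently, it can help to record them as two separate fields rather than one “quality” score. The following is a small sketch under that assumption; the class names, the two-level scale, and the example descriptions are invented for illustration and simply restate the two cases described above.

```python
from dataclasses import dataclass
from enum import Enum

class Level(Enum):
    LOW = "low"
    HIGH = "high"

@dataclass(frozen=True)
class Prototype:
    name: str
    resolution: Level  # how much detail the prototype contains
    fidelity: Level    # how closely it matches the final creation

def describe(p: Prototype) -> str:
    return f"{p.name}: {p.resolution.value}-resolution, {p.fidelity.value}-fidelity"

examples = [
    Prototype("detailed mock-up of a design expected to change", Level.HIGH, Level.LOW),
    Prototype("simple sketch of a simple final product", Level.LOW, Level.HIGH),
]
for p in examples:
    print(describe(p))
```

A real project would likely want more than two levels, but the separation of the two axes is the part worth keeping.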

保真度较低的原型

Lower-Fidelity Prototypes

纸质原型是想法早期开发的常见做法,用于快速勾勒和测试概念。它可以涉及创建角色、场景、交互对象、用户界面或用户体验的粗略手绘草图。生成式 AI 可用于自动创建 UI 元素和布局。然而,当像 POP 这样的免费应用程序已经出现一段时间后,你为什么要使用人工智能,这是一个很好的问题,它允许你拍摄草图照片并伪造一个想法的用户流程。文本图像 AI 还可以创建低保真草图,而不仅仅是高保真超现实 3D 角色或场景。随着原型制作各个阶段的进展,这些可以让您变得更加清晰(图5-10)。

Paper prototyping is a common practice in the early-stage development of an idea, used to quickly sketch out and test concepts. It can involve creating a rough, hand-drawn sketch of a character, scene, interactive object, user interface, or user experience. Generative AI can be used to automate the creation of UI elements and layouts. Why you would use AI, however, is a good question when free applications like POP have been around for a while, allowing you to take photos of sketches and fake the user flow of an idea. Text-image AI can also create low-fidelity sketches, not just high-fidelity hyper-realistic 3D characters or scenes. These can then create more clarity as the stages of your prototyping progress (Figure 5-10).

An illustration of a group of drawings of crockery on a napkin.

图5-10

由生成式人工智能在餐巾纸上绘制的游戏创意的低保真图画,由一张餐巾纸上带有游戏涂鸦的照片提示。迭代次数 = 20

A low-fidelity drawing of a game idea on a napkin produced by generative AI prompted by a photo of a napkin with game doodles on it. Iterations = 20

生成式人工智能帮助团队快速启动自己的纸质原型的一个好方法是为设计师提供灵感和新想法的来源。通过针对特定主题生成许多变体,或者如果团队纠结于他们想要的游戏角色类型,人工智能可以帮助设计师为他们的新兴原型提出新的创意。生成式人工智能“纸质”原型还可以集成到移动应用程序中,例如通过 iOS 或 Android 进行 POP 原型交互。他们提供另一种类型的快速绘图,特别是当团队成员因为觉得自己的技能不足而不愿意绘图时。生成的虚拟纸质原型让创意人员可以想象通过进一步开发他们会变成什么样子。通常,新的特征、颜色、形状,甚至角色都会出现。

A great way that generative AI can help jumpstart a team’s own paper prototyping is by providing a source of inspiration and new ideas for designers. By generating many variations on a particular theme, for example when a team is stuck on what types of characters they want for their game, an AI can help designers come up with new and creative ideas for their emerging prototypes. Generative AI “paper” prototypes can also be integrated into mobile applications like POP to prototype interactions via iOS or Android. They offer another type of rapid drawing, especially if team members are reluctant to draw because they feel their skills are inadequate. Generated virtual paper prototypes allow creatives to imagine what they could become with further development. Often, new features, colors, shapes, and even characters can emerge.

An illustration of a group of drawings of cartoons on a napkin.

图5-11

AI根据图5-10在餐巾纸上生成游戏角色铅笔画。迭代次数 = 42

AI-generated pencil drawings of game characters on a napkin based on Figure 5-10. Iterations = 42

低保真度原型也可以采用可以演示其功能的物理原型的形式。剧院或建筑模型的场景设计中使用的模型就是这种情况,这些模型可以从较低或中等的保真度开始,然后走向更高保真度的原型,为那些将要构建它们的人提供最终结果的视觉效果(图5- 12)。

Lower-fidelity prototypes can also take the form of physical prototypes whose features can be demonstrated. This is the case with maquettes that are used in scenic design for theater or architectural models that can start at lower or medium fidelity and then journey toward higher-fidelity prototypes to give those who will construct them a visual of the end result (Figure 5-12).

An illustration of the architecture of a structure.

图5-12

AI 根据带有折纸滤镜的图像提示生成建筑模型的横截面。迭代次数 = 59

AI-generated cross-section of an architectural maquette from an image prompt with an origami filter. Iterations = 59

中低保真度原型

Lower- to Medium-Fidelity Prototypes

有很多原型的例子,我们对其进行迭代的次数越多,其保真度和分辨率就越高。其中包括手写或草图、餐巾纸上或 iPad 上的草图、网站的线框、用户将按下的按钮、博客文章、3D 模型、动画、弹跳球动画在动画中的集成。游戏引擎,或者您正在设计的游戏的一部分,可以让用户了解整个游戏或让他们深入玩游戏的想法。我们的缪斯提供了哪些例子?

There are many examples of prototypes that increase in fidelity and resolution the more we iterate on them. These include handwritten or sketched notes, a rough drawing on a napkin or on an iPad, the wireframe for a website, a button users will press, a blog post, a 3D model, an animation, the integration of a bouncing ball animation in a game engine, or a part of the game you are designing, whether a slice that gives users an idea of the whole game or a single level they can play in depth. What examples does our muse offer?

Muse:
  • 艺术家可以在完成最终版本之前生成绘画的原型。

  • An artist can generate a prototype of a painting before completing the final version.

  • 软件开发人员可以在以前的版本已经向公众发布的情况下创建新应用程序的原型。许多手机游戏都是这种情况,因为下一个版本也是由玩家行为决定的。

  • A software developer may create a prototype of a new app while the previous version is already released to the public. This is the case with many mobile games as the next version is also informed by player behavior.

  • 科学家可能会先创建新的科学假设的原型,然后再进行大量实验来验证它。

  • A scientist may create a prototype of a new scientific hypothesis before conducting extensive experiments to validate it.

  • 作者会为一本书创建一个简介,并对其进行数十甚至数百次修改,以使其更具凝聚力,总结该书的内容,或定义该书的内容。

  • A writer will create an introduction for a book and rework it dozens or even hundreds of times in order to make it more cohesive, summarize the content of the book, or define what the book is about.

在每种情况下,原型都作为起点,允许创建者评估结果、测试目标用户、对原型的某些部分进行改进、放弃其他部分,并且总体上迭代地改进他们的原始想法,直到他们满意为止。感觉自己的工作完成了。即便如此……总有改进的空间。

In each of these cases, the prototype serves as a starting point, allowing the creators to evaluate the results, test targeted users, make refinements to certain parts of the prototype, drop others, and, overall, iteratively improve upon their original idea until they feel their work is complete. And even then… there is always room for improvement.

早些时候,我们将人工智能开发人员集体描述为“绿野仙踪”,从代码幕后控制任何人工智能。另一方面,绿野仙踪原型是一个描述低中保真度原型的术语,它允许有限的功能,但通过人工干预模拟交互。在设计过程方面,《绿野仙踪》原型涉及人类演员,他们在幕后操纵交互,以创造功能原型的幻觉。

Earlier we represented the collective of AI developers as Wizards of Oz, controlling any AI from behind a curtain of code. “Wizard of Oz prototype,” on the other hand, is a term for a low- to medium-fidelity prototype that offers limited functionality but simulates interactions through human intervention. In terms of design processes, Wizard of Oz prototypes involve human actors who manipulate the interactions behind the scenes to create the illusion of a functional prototype.
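In software terms, a Wizard of Oz prototype can be as simple as an interface whose replies come from a person rather than from working logic. The sketch below assumes a text-based interaction; the class name, the `operator` parameter, and the scripted replies are all hypothetical and exist only to make the idea concrete.

```python
from typing import Callable

class WizardOfOzPrototype:
    """A fake 'system' whose replies are supplied by a hidden human operator."""

    def __init__(self, operator: Callable[[str], str]):
        self._operator = operator  # the person behind the curtain
        self.log = []              # (user input, reply) pairs kept for later review

    def respond(self, user_input: str) -> str:
        reply = self._operator(user_input)
        self.log.append((user_input, reply))
        return reply

# In a live session the operator could be the built-in `input`, typed by a
# teammate out of view. A scripted operator is used here so the sketch runs unattended.
scripted = {"play music": "Now playing your heartbeat remix."}
proto = WizardOfOzPrototype(lambda text: scripted.get(text, "Sorry, I can't do that yet."))
print(proto.respond("play music"))
print(proto.respond("order pizza"))
```

The log matters as much as the replies: reviewing what testers asked for, and what the operator had to improvise, is where the design feedback comes from.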

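These prototypes are useful in situations where a functional prototype is not feasible because of time, technology, or budget constraints. By simulating interactions for potential users, designers and researchers can still gain valuable insights into how users would interact with the final product (Figure 5-13). In addition, a Wizard of Oz approach allows for rapid iteration and adjustment of the prototype based on user feedback, which can ultimately lead to a better final product.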

An illustration of a robot with a curtain of graphics of blocks of tools at the backside.

图5-13

幕后的人工智能向导基于一张带有冰箱贴的冰箱旧照片,由维基共享资源提供。迭代=50

The AI wizard behind the curtain based on an old photo of a fridge with fridge magnets courtesy of Wikimedia Commons. Iterations = 50

尽管绿野仙踪原型可能无法完全发挥作用,但它们在发现可用性问题、识别设计缺陷和测试用户参与度方面可以非常有效。它们对于测试团队不确定的概念特别有用,在这些概念得到解答之前创建功能齐全的原型可能会浪费时间。绿野仙踪方法可以帮助设计人员和研究人员快速、经济高效地测试他们的想法,并获得即时反馈,为中保真度或更高保真度原型提供信息。更高保真度的原型通过包括数字在内的多种形式来表示。数字和物理原型组合的一个例子是旋转扬声器的 3D 打印自动化机构(图5-14)。

Although Wizard of Oz prototypes may not be fully functional, they can be highly effective in uncovering usability issues, identifying design flaws, and testing user engagement. They are particularly useful for testing concepts that a team is unsure about, where creating a fully functional prototype may be a waste of time before burning interactive design questions are answered. A Wizard of Oz approach can help designers and researchers quickly and cost-effectively test their ideas and receive immediate feedback that can inform medium- or higher-fidelity prototypes. Higher-fidelity prototypes take multiple forms, including digital ones. An example that combines digital and physical prototyping is a 3D-printed, automated mechanism that rotates a speaker (Figure 5-14).

A photograph of a speaker mounted on a bar fixed at both ends. It is kept on a cantilever slab attached to a wall.

图5-14

由 Michal Suchanek 创建的3D 打印机构的作者在 2022 年温哥华美术馆的“模仿游戏:艺术与人工智能”展上安装的高超音速扬声器左、右、上、下旋转的实际照片

An actual photo by the author of a 3D-printed mechanism created by Michal Suchanek with mounted hypersonic speakers rotating left, right, up, and down at the Imitation Game: Art and AI at the Vancouver Art Gallery, 2022

使用生成式 AI 进行原型设计

Prototyping with Generative AI

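So, where do you start with generative AI as a prototyping tool? Well, it may help to first decide what you want to create or co-create, and for whom. In other words, you need to imagine the final form of your prototype. Is it a painting in an art gallery aimed at a target audience of wealthy investors? Is it a generative NFT that could generate more cryptocurrency? Is it a 3D model of a glucose molecule that can help show undergraduate students the pathway of glycolysis? Once some decisions have been made, you can use generated AI content as a starting point to reverse engineer how to get there (Figure 5-15). That process or workflow will itself be iterative.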

An illustration of the dragon warrior. She has a dragon on her head with wings extended at her back.

图5-15

使用二十世纪非裔美国雕塑家奥古斯塔·萨维奇(Augusta Savage)作为“龙战士”的公共领域照片生成的图像。迭代次数 = 173

A generated image using a public domain photo of twentieth-century African American sculptor Augusta Savage as “dragon warrior.” Iterations = 173

  • 确定您希望最终原型采用什么媒介。

  • Determine what medium you want the final prototype to be.

  • 通过选择一种或多种人工智能生成的内容来制定实现目标的迭代步骤。

  • Map out the iterative steps to get there by choosing one or several types of AI-generated content.

  • 通过测试和反馈进行优化​​,并将其应用于下一个发展阶段。

  • Refine as you go along through testing and feedback that you apply to your next developmental stage.

另一方面,如果你没有想法并且完全开放,你可以随时询问自然语言模型来启发你(图5-16)。

On the other hand, if you have no ideas and are completely open, you can always ask a natural language model to inspire you (Figure 5-16).

An illustration of a chimp in a laboratory.

图5-16

提示“学习语言模型准备好激发你的灵感”伴随着一张黑猩猩的公共领域照片。总迭代次数 = 112

The prompt “learning language model ready to inspire you” accompanies a public domain photo of a chimp. Total iterations = 112

作者请给我三个我可以原型化的想法,这些想法是开箱即用的,以前可能没有人做过,请不要称它们为假设。

Author: Give me three ideas of what I can prototype that are out-of-the-box ideas and that no one has maybe done before and don’t call them hypothetical please.

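Muse: These ideas are purely hypothetical, and it’s important to keep in mind that their feasibility and viability may vary. Here are three potential ideas: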
  • 将心跳声转化为音乐的可穿戴设备:该设备可以设计为拾取您的心跳声并将其实时转换为独特的音乐作品。这可以提供一种新的、引人入胜的方式来体验针对佩戴该设备的个人的个性化音乐。

  • A Wearable Device That Converts the Sound of Your Heartbeat into Music: The device could be designed to pick up the sound of your heartbeat and convert it into a unique musical composition in real time. This could provide a new and engaging way to experience music that is personalized to the individual wearing the device.

  • 一款根据您的日常安排和活动水平生成个性化锻炼程序的智能手机应用程序:该应用程序可以利用来自各种来源的数据,包括手机的传感器和外部数据源,来生成根据您的日常安排和活动量身定制的个性化锻炼程序。活动水平。这可以提供一种新的创新方式来鼓励人们保持活跃和健康。

  • A Smartphone App That Generates Personalized Workout Routines Based on Your Daily Schedule and Activity Level: The app could leverage data from a variety of sources, including your phone’s sensors and external data sources, to generate personalized workout routines that are tailored to your daily schedule and activity level. This could provide a new and innovative way to encourage people to stay active and healthy.

  • 可以轻松重新配置以适应不同需求和空间的模块化家具系统:家具系统可以由一组可以轻松组装和重新配置以适应不同需求的模块化组件组成。 需求和空间。例如,根据用户的需求,可以将组件布置为创建休息区、办公桌或存储系统。这可以提供一种新的创新方式来创建根据用户需求量身定制的多功能且适应性强的生活空间。

  • A Modular Furniture System That Can Be Easily Reconfigured to Suit Different Needs and Spaces: The furniture system could consist of a set of modular components that can be easily assembled and reconfigured to suit different needs and spaces. For example, the components could be arranged to create a seating area, a work desk, or a storage system, depending on the user’s needs. This could provide a new and innovative way to create versatile and adaptable living spaces that are tailored to the needs of the user.

没有人说生成式人工智能会给你一个独特的想法,你可以投入时间和资源(图5-17)。你永远不知道你会得到什么,这使它成为一个令人沮丧但也是一个奇妙的创意伴侣。当人工智能没有给你你所想象或希望它能给你的东西时,它可能会激励你采取新的方向,甚至放弃你之前的想法。这个过程可能会让你想出比你最初的想法更具创新性的东西。

Nobody said the generative AI would give you a unique idea that you can invest time and resources in (Figure 5-17). You never know what you’re going to get, which makes it a frustrating but also a fantastic creative companion. When an AI doesn’t give you what you were imagining or hoping it would give you, it might inspire you to take a new direction or even abandon what you thought before. That process might lead you to coming up with something more innovative than your original idea.

An illustration of a planet with dark spots and a brighter spot on its surface.

图5-17

一颗看起来像死星的行星,使用美国宇航局 (NASA) 的公共领域存档的月球照片,以达芬奇风格绘制。迭代次数 = 34

A planet that looks like the Death Star sketched in the style of da Vinci using a public domain archived photo of the moon by NASA. Iterations = 34

快速生成人工智能原型

Rapid Generative AI Prototypes

与其他原型一样,人工智能的功能也在不断发展和变化。它始终表示为自身的一个版本。正如我们自己的创造一样,人类不断地、创造性地改进它。无论是实体产品、服务、想法还是机器学习模型,原型都在不断完善和改进。这种迭代开发和改进的过程使得一切随着时间的推移而成长、发展并变得更好(图5-18)。

As with other prototypes, the features of an AI are constantly evolving and changing. It is always represented as a version of itself. Humans improve it iteratively and creatively just as we do our own creations. Whether it’s a physical product, a service, an idea, or a machine learning model, prototypes are continually being refined and improved. This process of iterative development and improvement is what allows everything to grow, evolve, and become better over time (Figure 5-18).

An illustration of a woman reading a book and turning the page with smoke around her.

图5-18

麦克白夫人正在朗读莎士比亚戏剧,其中的鬼魂是根据现场戏剧制作的麦克白海报的公共领域印刷品制作的。迭代次数 = 78

Lady Macbeth reading a Shakespeare play with ghosts coming out, based on a public domain print of a Macbeth poster for a live theatrical production. Iterations = 78

AI can be used to rapidly generate
  • 戏剧、剧本、短篇小说、小说等中的场景、人物、物体的模型。

  • Mock-ups for scenes, characters, objects in a play, screenplay, short story, novel, etc.

  • 您希望人工智能为您提供多个版本的段落变体

  • As many versions of a paragraph as you’d like, by tasking an AI to give you variations of it

  • 为您的社交媒体提要提供不同艺术风格的文本提示图像

  • Text-prompted images in different artistic styles for your social media feed

  • 可以作为电子书发布的儿童剧的角色创意和图像

  • Character ideas and images for a children’s play that can be released as an ebook

  • 在浏览器中作为图像播放的 .svg 代码

  • Code for an .svg that plays back as an image in your browser

  • 类似于漫画的随机风格和格式的图画小说故事板

  • Graphic novel storyboards in random styles and formats akin to a manga

  • 自然语言处理系统,例如语音识别和机器翻译,可以理解和响应人类语音

  • Natural language processing systems, such as voice recognition and machine translation, that can understand and respond to human speech

  • 用于制作面包的各种食谱,您可以烘烤、测试和比较

  • Variations of recipes to create bread that you can bake, test, and compare

  • 动画弹跳球的代码,您可以将其发送给程序员以在其入职流程中进行更正

  • The code for an animated bouncing ball that you can send to a programmer to correct as part of their onboarding process

  • 可以使用项目来激发真实绘画灵感并绘制在画布或墙上的图像

  • An image that can be used to inspire a real painting, projected with a projector and painted onto canvas or a wall

  • 可以在 Adob​​e Illustrator 中自定义然后用作收藏品的 NFT 集合

  • A collection of NFTs that can be customized in Adobe Illustrator and then used as collectibles

  • 达芬奇风格的草图,作为您为朋友制作的贺卡的一部分(图5-19

  • A sketch in the style of da Vinci as part of a greeting card you are making for a friend (Figure 5-19)

A sketch of a spaceship with a planet on the backside.

图5-19

达芬奇风格的宇宙飞船草​​图,使用达芬奇飞行器之一的公共领域图像。迭代次数 = 80

Sketch of a spaceship in the style of da Vinci using a public domain image of one of da Vinci’s flying machines. Iterations = 80

前面的每个示例都应被视为工作流程的一部分,这些工作流程是将生成的内容提升到另一个保真度和分辨率级别所必需的。人工智能生成的内容是对话的开始。任何原型(包括人工智能生成的原型)的部分开发也需要进行测试,并且在可能的情况下,由多个人进行测试。

Every preceding example should be considered part of a workflow that is necessary to take the generated content to another level of fidelity and resolution. AI-generated content is the start of a conversation. Any prototype, including those generated by AI, also needs to be tested as part of its development and, when possible, by more than one person.
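Two items from the list above, code for an .svg and code for an animated bouncing ball, can be combined into one small example of the kind of output you might ask an AI for and then hand to a programmer to correct. This is a hedged sketch: the function name and default sizes are invented, and the motion is a simple linear up-and-down loop using the SVG `<animate>` element rather than physically accurate bouncing.

```python
import xml.etree.ElementTree as ET

def bouncing_ball_svg(width=200, height=200, radius=15, seconds=1.0):
    """Return SVG markup for a ball that moves between the top and bottom edges."""
    top, bottom = radius, height - radius
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<circle cx="{width // 2}" cy="{top}" r="{radius}" fill="tomato">'
        f'<animate attributeName="cy" values="{top};{bottom};{top}" '
        f'dur="{seconds}s" repeatCount="indefinite"/>'
        f'</circle></svg>'
    )

svg = bouncing_ball_svg()
ET.fromstring(svg)  # raises ParseError if the markup is not well-formed XML
print(svg)
```

Saving the printed text to a file ending in .svg and opening it in a browser is a quick first test. Adding easing so the ball accelerates as it falls would be a natural next iteration.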

再生测试

Regenerative Testing

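Testing AI-generated content means examining the generated results and understanding how they can be integrated, modified, regenerated, or used as part of a larger idea or vision you are creating (Figure 5-20). Its use is meant to support your own creative process rather than to be treated as a final product that needs to be shared immediately, as if it were the end of the story. When we prototype, we fearlessly share our results to get feedback that improves the prototype. The idea of a prototype carries within it the value of continuous improvement.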

An illustration of a robot mouse standing on a table and conducting some tests in the laboratory.

图5-20

提示“机器人老鼠在实验室进行一些测试”以及公共领域的老鼠图像。迭代次数 = 33

Prompt of a “cyborg mouse conducting some tests in the lab” along with a public domain image of a mouse. Iterations = 33

  • 测试可以验证原型的预期用途。例如,如果提示人工智能生成面包食谱,那么了解食谱是否有效的唯一方法就是您尝试一下。这样做时,您可能会意识到食谱的某些部分缺失,包括如何正确处理面团。

  • Testing can validate the intended purpose of a prototype. For example, if an AI is prompted to generate bread recipes, the only way to know if a recipe will work is if you try it. In doing so you’ll likely realize that parts of the recipe are missing including how to handle dough properly.

  • 测试可以产生令人惊讶的结果。这些结果可以被放弃,或者它们可能会激发新的创作方向。

  • Testing can create surprising results. These results can be abandoned, or they might inspire a new creative direction.

  • 测试可以提供其他人的回应。这些可以验证您创建原型的意图。用户测试人员还可以促使您更改原型的某些内容。他们还可以挑战您对原型的看法

  • Testing can provide responses from others. These can validate what your intentions were in the creation of a prototype. User testers can also provoke you to change something about the prototype. They can also challenge what you thought your prototype was.

  • 测试可以突出您可能错过的内容。如果你太快地迷恋你的原型而没有从其他人那里得到关于它是什么的印象,那么这种方法尤其有效。

  • Testing can highlight what you might have missed. This is particularly effective if you become attached to your prototype too quickly without receiving impressions from other humans as to what it is.

  • 测试可以减少歧义。有时,我们进行测试是为了获得有助于指导我们的反馈,特别是当我们不确定我们实际尝试原型化的内容时。

  • Testing can reduce ambiguity. At times we test in order to receive feedback that helps guide us especially if we are unsure of what we are actually attempting to prototype.
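The bread recipe example above suggests one cheap pre-test you can run before baking anything: checking generated text against a checklist of parts you expect it to contain. The sketch below is only an illustration; the checklist words and the sample recipe are made up, and a simple keyword match is no substitute for actually trying the recipe or asking other people to.

```python
REQUIRED_PARTS = ["ingredients", "knead", "proof", "bake"]  # assumed checklist, adjust freely

def missing_parts(recipe_text, required=REQUIRED_PARTS):
    """Return the checklist words that never appear in the generated recipe."""
    lowered = recipe_text.lower()
    return [part for part in required if part not in lowered]

generated = "Ingredients: flour, water, yeast, salt. Mix, then bake at 230 C for 30 minutes."
print(missing_parts(generated))
# ['knead', 'proof']
```

Here the check flags that the generated recipe never says how to handle the dough, which is exactly the kind of gap the first bullet above warns about.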

致谢

Acknowledgments

  • 致 L. Frank Baum 和《绿野仙踪》中的人物,他们的奇妙世界激发了许多童年的想法和想象的故事

  • To L. Frank Baum and the characters of The Wonderful Wizard of Oz whose fantastic world inspired many childhood ideas and imagined stories

  • 致那些再次制造机器人的人,猫、猴子和动漫眼睛

  • To those who make robots again, cats, monkeys, and anime eyes

  • 致所有那些不情愿参与人类研究的老鼠、猴子、猫、鸡

  • To all those mice, monkeys, cats, and chickens who have unwillingly participated in human research

  • 献给那些喂养我们的鸡

  • To the chickens who feed us

  • 感谢 Raith Sienar 和 Lucas 团队创造了名为“死星”的令人讨厌的创造物

  • To Raith Sienar and the Lucas team for that nasty creation called the Death Star

  • 莎士比亚和他批判性的、讽刺性的、引人入胜的故事

  • To Shakespeare and his critical, ironic, and captivating stories

  • 致龙的创造者、女战士、对我们随处可见的动漫大眼睛产生巨大影响的手冢治虫,以及制作了令人惊叹的蛋糕的宫本

  • To dragon creators, warrior women, Osamu Tezuka who greatly influenced those large anime eyes we see everywhere, and Miyamoto for ridiculously amazing cakes

  • 致模型制作者和从空中捕捉地球的人们

  • To the makers of maquettes and those who capture the earth from above

  • 致对我的现实影响极大的所有故事书制作者

  • To all manner of storybook makers who have greatly influenced my reality

  • 十七世纪折纸现象不断让我们睁开眼睛

  • To that seventeenth-century phenomenon known as origami that continues to fold open our eyes

An illustration of a robot with an inset and its robotic version on the right.

图5-21

一张 1942 年的海报上提示“机器人正在尽最大努力减少幻觉”,标题是“如果缺席的虫子咬你……请告诉我们。”你们的劳动管理委员会。”迭代次数 = 111

Prompt of “robot doing their best to reduce their hallucination” accompanying a 1942 poster with the caption “If the Absentee Bugs Bite You… Tell Us. Your Labor Management Committee.” Iterations = 111

© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
P. Parra Pennefather, Creative Prototyping with Generative AI, Design Thinking, https://doi.org/10.1007/978-1-4842-9579-3_6

6. 构建模块

6. Building Blocks

Patrick Parra Pennefather (1)
(1) University of British Columbia, Vancouver, BC, Canada

经验丰富的原型创作者与即兴创作者有很多共同点。他们对尝试创造一些他们以前从未经历过的东西感到好奇。一些创意人员的任务是将这作为他们工作的一部分,为他们正在创造的产品或服务寻找新的市场。为了快速创作,他们依赖技能、技术和经验的结合,并习惯于“加强”他们可以用来创作作品的各种不同的软件应用程序。

Experienced creatives who prototype share a lot in common with improvisers. They are curious in their attempts to create something they haven’t experienced before. Some creatives are tasked to do this as part of their jobs, to find a new market for the product or service they are creating. To create rapidly they rely on a combination of skills, technique, and experience and are used to “ramping up” on a variety of different software applications that they can create their work with.

生成式人工智能是创意沙盒中的另一个工具。它构成了工具词典的一部分,可以用来启发、激发创意,并作为达到目的的手段。生成式人工智能可以为各个学科的艺术家提供支持,让他们在与自己长期发展的技术的不断对话中学习如何使用它。视觉艺术家学习如何使用工具、概念和各种不同的媒体。他们还学习构图、视角、空间、形状等,并可以将这些应用到他们用人工智能生成的内容中。舞者学习如何有效地使用肌肉群,大多数人都有芭蕾舞基础,从 plié(屈膝)到 rond de jambe(划圈)。音乐家依靠多种技巧来演奏乐器并学习旋律、和声和节奏。作曲家和即兴音乐家依靠手势技巧、音阶结构、旋律、和声、不协和音等。

Generative AI is another tool in the sandbox for creatives. It forms part of a lexicon of tools to draw from to inspire, provoke, and use as a means to an end. Generative AI can support artists from all disciplines who learn how to use it in constant dialogue with the technique they’ve developed over time. Visual artists learn how to use tools, concepts, and a variety of different media. They also learn about composition, perspective, space, shape, etc. and can apply these to the content they generate with an AI. Dancers learn how to use muscle groups efficiently, and most have a foundation in ballet from plié to rond de jambe. Musicians rely on several techniques to play their instruments and learn about melody, harmony, and rhythm. Composers and improvising musicians rely on gestural technique, scalar structures, melody, harmony, dissonance, etc.

但是可以应用哪些技术来利用人工智能生成原型内容呢?

But what techniques can be applied to generate prototypical content with an AI?

组件或构建块

Components or Building Blocks

每个创意人员在创作时都会涉及到一些固有的构建块。加法、减法和替换等构建块是有助于任何创作发展的基本组成部分。这些元素代表了创意人员用来创建、修改和完善其作品的核心流程。与人工智能交互时,除了人工智能提供的生成内容之外,构建块还可以应用于提示写作的探索。就生成的内容而言,创意人员可以辨别他们想要添加、删除、替换、转换、扩展、变化和迭代的内容。例如,当我即兴演奏钢琴时,我可以用右手弹奏一个音符并按住十秒钟。假设该音符是中间的 C。然后我可以弹奏它上面的音符 G,然后我可以重复该序列。我可以继续添加音符,直到获得一系列音符(C、G、E、Bb)。我也可以重复该模式,但用 A 替换 Bb,看看结果是什么。我可以通过给每个音符不同的持续时间来创建这些音符的变奏,从而创造变奏。然后,我可以通过添加左手弹奏低音 C 来增强复杂性。我还可以向上或向下移动音符以产生变化,并保持音符之间的距离相同。在这样做的过程中,我将序列向上调换。当使用单词而不是注释来提示生成人工智能时,可以应用类似的构建块。

There are inherent building blocks that every creative engages with when they create. Building blocks like addition, subtraction, and substitution are fundamental components that contribute to the development of any creation. These elements represent the core processes that creatives use to create, modify, and refine their work. When interacting with an AI, building blocks can be applied to the exploration of prompt writing, in addition to the generated content the AI offers. In terms of generated content, creatives discern what they want to add, take away, substitute, transform, scale, vary, and iterate upon. When I improvise at the piano, for example, I can play one single note with my right hand and hold it down for ten seconds. Let’s say that note is middle C. I can then play the note G above it, and then I can repeat that sequence. I can keep adding notes until I have a sequence of notes (C, G, E, Bb). I can also repeat the pattern but substitute the Bb with an A and see what the result is. I can create variations of just those notes by giving each a different duration. I can then build up the complexity by adding my left hand playing a low C. I can also shift the notes up or down to create variation, keeping the same distance between the notes. In doing so I’m transposing the sequence. Similar building blocks can be applied when using words, instead of notes, to prompt a generative AI.
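
The piano example above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names (`add`, `substitute`, `transpose`) and the use of MIDI note numbers (middle C = 60) are assumptions made for the example, not something the chapter prescribes.

```python
# Notes are stored as MIDI numbers (middle C = 60). All names here are
# hypothetical helpers invented for this illustration.
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def name(midi_note):
    """Return the pitch-class name of a MIDI note number."""
    return NOTE_NAMES[midi_note % 12]


def add(sequence, note):
    """Addition: return a new sequence with one more note at the end."""
    return sequence + [note]


def substitute(sequence, old, new):
    """Substitution: swap every occurrence of one note for another."""
    return [new if n == old else n for n in sequence]


def transpose(sequence, semitones):
    """Transposition: shift every note by the same interval."""
    return [n + semitones for n in sequence]


sequence = []
for note in (60, 67, 64, 70):  # C, G, E, Bb
    sequence = add(sequence, note)

substituted = substitute(sequence, 70, 69)  # replace Bb with A
transposed = transpose(sequence, 2)         # shift everything up a whole tone

print([name(n) for n in sequence])
print([name(n) for n in substituted])
print([name(n) for n in transposed])
```

Because transposition adds the same number to every note, the distances between neighboring notes stay the same, which is the property described in the paragraph above.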

提示框中的构建块

Building Blocks in the Prompt Box

假设提示是基于文本的,则在创建该提示时应用构建块。这一过程可能涉及一个或多个构建块,许多创意人员在创作文本、图像、舞蹈、戏剧、音乐或视频时也会使用这些构建块。这些可以包括变化、加法和减法、替换、迭代、增加、减少、变换、转置以及调用和响应。如果您之前曾提示过 AI,那么您会对这些内容感到熟悉。每一个都可以提高任何生成人工智能的原型设计能力。

Assuming the prompt is text-based, building blocks are applied in the creation of that prompt. The process may involve one or more building blocks that many creatives also use when it comes to the composition of text, image, dance, theater, music, or video. These can include variation, addition and subtraction, substitution, iteration, augmentation, diminution, transformation, transposition, and call and response. If you have prompted an AI before, these will be familiar to you. Each can boost the prototyping capabilities of any generative AI.

早餐作为数据集

Breakfast as Data Set

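为了找到跨学科的共鸣,烹饪食谱可以对其中一些构建模块进行建模(图6-1)。例如,我们可以演示不同成分在配方中的应用,以在原型设计时突出显示添加的构建块。将配方中的一种成分替换为另一种成分可以代表替代的基石。

To find resonance across disciplines, cooking recipes can model some of these building blocks (Figure 6-1). For example, we can demonstrate how different ingredients are applied in a recipe to highlight the building block of addition when prototyping. Substituting one ingredient for another in a recipe can represent the building block of substitution.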

An animated picture of an egg. It is a response to generate a warrior egg preparing for battle using A I. It has an eye, a nose, a mouth, and a sharp weapon.

图6-1

作者的鸡蛋照片,并附有提示“战士鸡蛋准备战斗”。迭代次数 = 29

Author’s photo of an egg accompanying the prompt “warrior egg preparing for battle.” Iterations = 29

作者:早餐有哪些创意?

Author: What are some breakfast ideas?

缪斯(Muse):
  • 菠菜和羊奶酪炒鸡蛋

  • Scrambled eggs with spinach and feta cheese

  • 混合浆果和蜂蜜的希腊酸奶

  • Greek yogurt with mixed berries and honey

  • 牛油果吐司配炒鸡蛋和烟熏三文鱼

  • Avocado toast with scrambled eggs and smoked salmon

  • 用酸奶、冷冻浆果和菠菜或羽衣甘蓝制成的冰沙碗

  • Smoothie bowl made with yogurt, frozen berries, and spinach or kale

  • 全麦华夫饼或薄煎饼,上面撒上新鲜水果和枫糖浆

  • Whole-grain waffles or pancakes topped with fresh fruit and a drizzle of maple syrup

当您与任何生成式人工智能交互时,您会激活重要的创造力,并开始建立自己独特的方法来实现越来越有用的结果。有意识地激活创意构建模块将帮助您充分利用生成式人工智能,将其作为您可能会发现有用的另一种工具。

As you interact with any generative AI, you activate important creative muscles and start to establish your unique approaches to achieve increasingly useful results. Activating creative building blocks intentionally will help you take best advantage of generative AI as another tool you may find useful.

变化

Variation

变异是生成式人工智能的目的。您可以通过简单地使用完全相同的文本、音频、视频或图像提示重新生成内容来创建不断变化的变化。虽然生成式人工智能会根据训练数据寻找模式来生成最佳结果,但它实际生成的结果是不可预测的。这是因为人工智能算法通过构建输入数据的概率模型来预测结果。他们分析数据中的模式和关系,以确定不同结果的可能性。可以预见的是,人工智能产生的不同结果是不同的。

Variation is what generative AI is built for. You can create variation upon variation by simply regenerating content with the exact same text, audio, video, or image prompt. While generative AI looks for patterns to generate the best result based on its training data, what it will actually generate is unpredictable. This is because AI algorithms predict outcomes by building probabilistic models of the input data. They analyze patterns and relationships within the data to determine the likelihood of different outcomes. Those different outcomes that AI generates are, predictably, different.
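
A toy sketch can make the idea of probabilistic generation more concrete. The code below is not how any real generative model works internally; it simply samples from a small, invented table of weighted breakfast ideas to show why the same prompt can return different results each time. The table `IDEAS`, its weights, and the function `generate` are all assumptions made for illustration, and the prompt text is deliberately ignored by this stand-in.

```python
import random

# Invented weights for illustration only; a real model learns its
# probabilities from training data.
IDEAS = {
    "scrambled eggs": 0.4,
    "overnight oats": 0.3,
    "avocado toast": 0.2,
    "breakfast burrito": 0.1,
}


def generate(prompt, rng, k=3):
    """Sample k ideas according to their weights.

    The prompt is accepted but not used: this toy only demonstrates that
    sampling from probabilities produces varying output.
    """
    options = list(IDEAS)
    weights = [IDEAS[option] for option in options]
    return rng.choices(options, weights=weights, k=k)


rng = random.Random()  # unseeded, so results change from run to run
first = generate("What are some breakfast ideas?", rng)
second = generate("What are some breakfast ideas?", rng)
print(first)
print(second)
```

Running the script several times will usually print different lists, with the more heavily weighted ideas appearing more often.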

某些生成式人工智能不会为您提供足够的附加功能,而这些附加功能在启用后会在相同的文本提示下提供更丰富的变体。另一方面,重新表述提示可能会产生截然不同的内容,您可能会发现这些内容在原型设计过程中很有用。一个极端的例子是使用欺骗手段促使人工智能生成可能被标记为禁止的内容。为什么这样做也应该受到质疑,但有些人发现测试人工智能是他们创作过程的重要组成部分,可能会产生意想不到的反应,他们可以用这些反应来激发自己的创造力。然而,在开发更复杂的提示之前,看看当您使用完全相同的提示重新产生更多早餐创意时会发生什么是有用的。

Some generative AI tools lack the additional features that, when enabled, provide even richer variations from the same text prompt. On the other hand, rephrasing a prompt may result in wildly different content that you might find useful in your prototyping process. An extreme example is using trickery to prompt an AI to generate content that has been labeled prohibited. Why you would do this should also be questioned, but some people find that testing an AI is an important part of their creative process and may yield unanticipated responses that they can use to fuel their own creativity. Prior to developing more complex prompts, however, it is useful to see what happens when you use the exact same prompt to regenerate more breakfast ideas.

作者:早餐有哪些创意?

Author: What are some breakfast ideas?

缪斯(Muse):
  • 早餐卷饼,里面有炒鸡蛋、黑豆、鳄梨和莎莎酱

  • Breakfast burrito filled with scrambled eggs, black beans, avocado, and salsa

  • 隔夜燕麦由燕麦片、希腊酸奶、混合浆果和少许蜂蜜或枫糖浆制成

  • Overnight oats made with rolled oats, Greek yogurt, mixed berries, and a drizzle of honey or maple syrup

  • 藜麦或小米粥,上面撒上混合坚果、种子和新鲜水果

  • Quinoa or millet porridge topped with mixed nuts, seeds, and fresh fruit

  • 用全麦面包制成的法式吐司,上面撒上新鲜水果和糖粉

  • French toast made with whole-grain bread and topped with fresh fruit and a dusting of powdered sugar

如果您不喜欢结果或者该结果是幻觉,大多数生成式人工智能模型都会为您提供重试的机会。幻觉将在另一章中更详细地介绍。人工智能幻觉是人工智能算法在尝试从训练数据中重新创建模式时生成的无意识或意外的输出。想象一下幻觉,这是人工智能产生的一种反应,尽管听起来很有说服力,但它根本不准确、不真实,或者更糟糕的是,根本不好吃。这可能会导致不切实际的图像、无意义的文本或异常的音频序列。

Most generative AI models provide you with the opportunity to simply try again if you don’t like the result or if that result is a hallucination. Hallucinations will be covered in another chapter in more detail. AI hallucinations are unintended or unexpected outputs generated by an AI algorithm when it tries to recreate patterns from its training data. Consider a hallucination a response that an AI generates that, as convincing as it may sound, is simply inaccurate, untrue, or, worse, not tasty at all. This can result in unrealistic images, nonsensical text, or unusual audio sequences.

当谈到图像生成人工智能时,可用的变化似乎是无限的。其中一个例子是文本图像生成 AI Stable Diffusion,它一次为用户提供四种图像变体,供用户根据文本提示进行选择。出于多种原因,能够访问多种变体可能是有益的。例如,它允许您比较同一图像的不同版本,使您能够选择最适合您需求的版本。此外,它还可以帮助您避免创建重复的图像,这可能既耗时又无聊。

When it comes to image-generating AI, there is a seemingly limitless amount of variation available. One such example is the text-image generative AI Stable Diffusion, which provides users with four variations of an image at a time to choose from based on the text prompt. Having access to multiple variations can be beneficial for several reasons. For instance, it allows you to compare different versions of the same image, enabling you to select the best one for your needs. Additionally, it can help you avoid creating repetitive images, which can be time-consuming and boring.

查看多个变体可以帮助您探索创意过程的后续步骤。通过生成多个变体,您还可以通过将变体与其他构建块相结合来尝试不同的样式、效果和配色方案。这种方法对于艺术家、设计师和摄影师来说非常有用,他们可以在文本提示中附上自己创作的上传图像。生成式人工智能不仅可以向个人上传的照片、作品或原创艺术添加元素,而且许多还提供不断增长的滤镜库,从根本上改变它们的风格(图6-2)。许多生成式人工智能还提供放大功能,可以提高创意者上传的图像的质量和分辨率。

Reviewing multiple variations can support you in exploring the next steps your creative process will take. By generating several variations, you can also experiment with different styles, effects, and color schemes by combining variation with other building blocks. This approach can be useful for artists, designers, and photographers who can accompany a text prompt with an uploaded image of their own creation. Not only can generative AI add elements to a photo or original artwork that a person uploads, but many also offer a growing library of filters to radically transform it stylistically (Figure 6-2). Many generative AI also offer upscaling, which improves the quality and resolution of an image that a creative uploads.

A photograph and 2 animated versions of a garden. It is a response to generate fairies, heather, butterflies, and a pastoral scene using A I. The top one is a real photograph, the center one has butterflies, and the bottom one has fairies.

图6-2

根据在玛斯昆族国地区 UBC Nitobe 纪念花园拍摄的原始照片(上)生成的两种变体,并带有文字提示,其中包括“仙女、石南花、蝴蝶、田园风光”。迭代次数 = 12

Two variations generated from an original photo (top) taken at Nitobe Memorial Garden, UBC, on Musqueam territory with text prompts that included “fairies, heather, butterflies, a pastoral scene.” Iterations = 12

上传您自己的照片并创建变体或添加新元素可以改变以前的工作,而使用图像编辑软件可能需要花费数小时的时间。在图6-2生成的图像中,请注意底部两张图像中水池的消失以及石南花和仙女的添加。

Uploading your own photos and creating variations or adding new elements can transform previous work in ways that might have taken many hours using image editing software. Within the images generated in Figure 6-2, notice the disappearance of the water pond in the bottom two images and the addition of heather and fairies.

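然而,变化并不限于图像。大多数大语言模型(LLM)可以重写您在提示框中放置的文本。写作技巧在这里发挥作用,因为您还可以要求风格上的变化。以下是一些可能导致生成各种文本的提示想法。许多创意人员还使用大语言模型来生成文本提示,以用于文本图像、视频或音频生成人工智能。

Variation is not limited to images, however. Most LLMs can rewrite text that you place in the prompt box. Writing skill comes into play here, since you can also ask for variations in style. Below are some prompt ideas that can lead to a variety of generated text. Many creatives also use LLMs to generate text prompts for use with text-image, video, or audio generative AI.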
  • 重写并扩展300字。

  • Rewrite and extend for 300 words.

  • 用较少技术性的语言重写。

  • Rewrite with less technical language.

  • 为八岁的年轻观众重写。

  • Rewrite for a young audience of eight-year-olds.

  • 重写好像试图证明一个观点。

  • Rewrite as if trying to prove a point.

  • 像奥斯卡·王尔德一样重写。

  • Rewrite as if you are Oscar Wilde.
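
If you want to try all of these rewrite instructions against the same source text, a small helper can assemble the prompts for you. The sketch below only builds the prompt strings; the function name `build_rewrite_prompts` and the sample sentence are assumptions for illustration, and sending the prompts to a particular LLM is left out because that step depends on the tool you use.

```python
# The five rewrite instructions from the list above.
REWRITE_INSTRUCTIONS = [
    "Rewrite and extend for 300 words.",
    "Rewrite with less technical language.",
    "Rewrite for a young audience of eight-year-olds.",
    "Rewrite as if trying to prove a point.",
    "Rewrite as if you are Oscar Wilde.",
]


def build_rewrite_prompts(source_text):
    """Pair every instruction with the same source text (hypothetical helper)."""
    return [f"{instruction}\n\n{source_text}" for instruction in REWRITE_INSTRUCTIONS]


sample = "Generative AI is another tool in the sandbox for creatives."
prompts = build_rewrite_prompts(sample)
for prompt in prompts:
    print(prompt)
    print("---")
```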

用种子进行更微妙的变化实验

Experimenting with Seeds for More Subtle Variations

文本图像生成人工智能还创建称为“种子”的独特数字标识符,以允许对图像进行轻微排列。生成的每个图像都是一个种子。该种子允许您继续调整与该种子关联的图像。每个种子都与特定生成的图像有数字关联。并非所有生成式人工智能都会免费提供这些种子,但它们在增强原型方面非常有价值,只需稍加改动即可更好地符合您的愿景。当您生成一张您真正喜欢的图像,但其中可能有您不喜欢的部分而难以删除时,这非常有用。您可以重新生成相同的种子图像,并在提示中通过添加、减去或替换提示中的一个或多个单词来稍微修改文本(图6-3)。

Text-image generative AI also use numeric identifiers known as “seeds” to allow for slight permutations of an image. Every generated image is tied to a seed, and reusing that seed allows you to keep tweaking the image associated with it. Not all generative AI provide these seeds freely, but they can be valuable in enhancing prototypes with slight variations to better align with your vision. This is useful when you generate an image you really like but are having difficulty removing one part of it that you don’t. You can regenerate the same seeded image and slightly modify the text by adding, subtracting, or substituting one or more words of your prompt (Figure 6-3).
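
The role of a seed can be illustrated with a toy generator. This is a deliberately simplified stand-in and not the way any specific image model is implemented: the seed fixes a list of random starting numbers, and the prompt nudges those numbers slightly. The function names, the vector length `SIZE`, and the 0.1 nudge strength are all assumptions chosen for the example.

```python
import hashlib
import random

SIZE = 8  # length of the toy "image" vector (arbitrary choice)


def word_vector(word):
    """Deterministic pseudo-embedding with values in [0, 1] (toy stand-in)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:SIZE]]


def generate(prompt, seed):
    """Toy generator: the seed fixes the starting noise, the prompt nudges it.

    Assumes the prompt contains at least one word.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0, 1) for _ in range(SIZE)]
    words = prompt.lower().split()
    nudge = [sum(word_vector(w)[i] for w in words) / len(words) for i in range(SIZE)]
    return [n + 0.1 * d for n, d in zip(noise, nudge)]


def distance(a, b):
    """Euclidean distance between two toy images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


seed = 1234
original = generate("virtual world inside head of female cyborg", seed)
repeat = generate("virtual world inside head of female cyborg", seed)
tweaked = generate("virtual world inside head of female robot", seed)
reseeded = generate("virtual world inside head of female cyborg", seed + 1)

print(original == repeat)            # same seed and prompt reproduce the result
print(distance(original, tweaked))   # same seed, one word substituted
print(distance(original, reseeded))  # same prompt, different seed
```

In this toy, keeping the seed and changing one word can only move the result a small, bounded amount, while changing the seed replaces the starting noise entirely. That mirrors the workflow described above: hold the seed constant and adjust the wording to make subtle changes.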

Two animated pictures for a response to generate a virtual world inside the head with cute robots, profiles, hyper-realistic, 3D, hyper-detailed, unreal game engine. The left one is a female cyborg. The right one is a female robot.

图6-3

通过修改特定种子图像的提示来进行变体。左边,“女性机器人头部的虚拟世界,有可爱的机器人、轮廓、超现实、3D、超详细、虚幻的游戏引擎。”右边是“女性机器人头部的虚拟世界,有可爱的机器人、轮廓、超现实、3D、高度详细、虚幻的游戏引擎。” 450 个集合中的两次迭代

Variations by modifying the prompt of a specific seeded image. On the left, “virtual world inside head of female cyborg with cute robots, profile, hyper-realistic, 3D, hyper-detailed, unreal game engine.” On the right, “virtual world inside head of female robot with cute robots, profile, hyper-realistic, 3D, highly detailed, unreal game engine.” Two iterations from a collection of 450

加减

Addition and Subtraction

当与人工智能交互时,加法和减法的构建块会被快速应用,并揭示任何生成内容的可供性和局限性。当您觉得缺少某些想要添加到文本、图像、音乐、视频或其他媒体的内容时,可以将添加构建块应用于提示。在提示中添加文本可能会促使人工智能生成不同的结果。然而,有时您会面临挑战,想要添加一些您的缪斯无法生成的仍然缺失的东西。根据您自己的技能和技术,您可能必须从缪斯手中接管控制权,并使用其他软件添加缺少的内容。这可能很简单,比如在 Photoshop 中添加您无法让人工智能生成的颜色,或者自己撰写大语言模型(LLM)拒绝创建的文本。

The building blocks of addition and subtraction are applied rapidly when interacting with an AI and reveal the affordances and limitations of any generative content. The building block of addition can be applied to a prompt when you feel something is missing that you’d like to add to text, image, music, video, or other media. Adding text to a prompt may provoke the AI to generate different results. However, sometimes you’ll be faced with the challenge of wanting to add something that is still missing that your muse cannot generate. Depending on your own skill and technique, you’ll likely have to take over control from your muse and add what is missing using other software. This may be as simple as using Photoshop to add colors you cannot get an AI to generate or writing the text that an LLM refuses to create yourself.

当您觉得生成的内容中存在您想要删除的内容时,可以应用减法构建块。这可以是任何东西,例如生成的音乐片段开头的一层生成的声音对您不感兴趣,一个人的表现方式不符合您的审美或意图的图像,或者人工智能生成的文本你知道这是谎言或有偏见。您可以继续使用相同的提示生成内容,或者删除提示中可能导致 AI 添加您不需要的内容的单个单词。如果人工智能没有生成您想要的内容,那么您可能必须使用您熟悉的软件来启用您的技能和技术,例如 Photoshop 来裁剪图像或数字音频工作站来缩短人工智能生成的音轨,操纵某些使用均衡的频率,或将轨道用作多轨道合成中的一层。

The building block of subtraction can be applied when you feel something is present in generated content that you’d like to remove. This can be anything from a layer of generated sound in the beginning of a generated piece of music that doesn’t appeal to you, an image where how a person is represented doesn’t fit your aesthetic or intent, or text that an AI generates that you know is a lie or biased. You can continue to generate content with the same prompts or remove what might be a single word in your prompt that is causing the AI to add content you don’t want. If the AI is not generating what you want, then you may have to enable your skills and technique using software you are familiar with, such as Photoshop to crop an image or a digital audio workstation to shorten an AI-generated audio track, manipulate certain frequencies using equalization, or use the track as one layer in a multi-track composition.

我们可以使用加法工具来扩展文本-文本和文本-图像提示,或使用减法工具来缩短这些提示。这两种构建模块在创意产业中都很常见。每一个都可以应用于逐步开发想法和项目。使用法学硕士开发故事原型的作者可以通过逐渐构建他们的故事(一次一个场景)来使用加法。他们可能会从单个角色或场景开始,然后添加更多元素,如对话、冲突和解决方案,以创造丰富且引人入胜的叙事。相反,在收到反馈后,他们可能希望通过删除可能使其过于复杂的元素来重新生成故事的某些部分。

We can use the tool of addition to expand our text-text and text-image prompts or subtraction to shorten those prompts. Both building blocks are common across creative industries. Each can be applied to develop ideas and projects incrementally. Authors developing a prototype of a story using an LLM may use addition by gradually building up their story, one scene at a time. They might start with a single character or setting and then add more elements like dialogue, conflict, and resolution to create a rich and engaging narrative. Conversely, after receiving feedback they might want to regenerate some parts of the story by removing elements that might have made it overly complex.
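
For prompts written as comma-separated terms, addition and subtraction can be sketched as simple list operations. The helpers below (`split_terms`, `add_terms`, `subtract_terms`) are hypothetical names invented for this illustration, and the comma-separated format is an assumption about how the prompt is written, not a requirement of any particular tool.

```python
def split_terms(prompt):
    """Break a comma-separated prompt into trimmed, non-empty terms."""
    return [term.strip() for term in prompt.split(",") if term.strip()]


def add_terms(prompt, *new_terms):
    """Addition: append terms that are not already in the prompt."""
    terms = split_terms(prompt)
    for term in new_terms:
        if term.lower() not in [existing.lower() for existing in terms]:
            terms.append(term)
    return ", ".join(terms)


def subtract_terms(prompt, *unwanted):
    """Subtraction: drop the listed terms from the prompt."""
    drop = {term.lower() for term in unwanted}
    return ", ".join(t for t in split_terms(prompt) if t.lower() not in drop)


base = "house in a forest"
longer = add_terms(base, "heather on the driveway", "anime style")
shorter = subtract_terms(longer, "anime style")

print(longer)
print(shorter)
```

Keeping each version of the prompt in its own variable makes it easy to return to an earlier, simpler prompt after feedback suggests the result has become overly complex.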

视觉艺术中的添加可以指将新元素(例如颜色、形状、纹理或想法)融入到作品中。这个过程使艺术家能够尝试各种组件,将它们分层在一起以创造艺术品的深度和复杂性。视觉艺术家通常从简单的草图或概念开始,逐渐添加细节、层次和元素,以创造出更复杂、更具视觉吸引力的作品。例如,在绘画中,艺术家可能会想象一个基本的配色方案,当这些颜色在画布上显现并形成时,它们可能会激发新的冲动。艺术家直观或有意地改变、添加纹理并使用不同的颜色深浅来为不断发展的主题添加亮点。当使用文本图像生成人工智能时,提示的复杂性可能会增加。许多人工智能生成过于复杂的图像,需要删除特征。能够检查和审查生成的内容在很大程度上取决于艺术家为任何生成人工智能带来的技能和技术。在大多数情况下,艺术家的背景和经验将为虚拟画布本身的塑造提供信息,并产生更具吸引力和艺术性的结果,而生成式人工智能只能提供这些结果的开端。

Addition in visual art can refer to incorporating new elements, such as colors, shapes, textures, or ideas, into a composition. This process enables the artist to experiment with various components, layering them together to create depth and complexity in the artwork. Visual artists often begin with a simple sketch or concept and gradually add details, layers, and elements to create a more complex and visually appealing piece. In painting, for instance, an artist may imagine a basic color scheme, and as those colors manifest and take form on a canvas, they may inspire new impulses. Intuitively or intentionally, artists vary and add textures and use different shades of a color to add highlights to an evolving subject. When using a text-image generative AI, prompts can increase in their complexity. Many AI generate overly complex images that require features to be removed. Being able to inspect and review what is generated is highly dependent on the skill and technique that an artist brings to any generative AI. In most situations the background and experience of an artist will inform the shaping of the virtual canvas itself and lead to far more compelling and artistic results, of which generative AI can offer only the beginnings.

在音乐中,即兴钢琴家可以通过从单个音符开始,以不同的节奏重复该音符,通过组合这些音符开发出简单的旋律,分层低音线和和声来创作更丰富的音乐,从而自发地创作出一首作品。在不断发展的作品中添加和减去单个音符的简单性对于专注于声音生成的生成式人工智能来说要提示要困难得多。虽然它们可能擅长模仿特定艺术家的说唱的重复循环,但在生成的音乐中分离曲目或提示人工智能删除或减去您在整体声音中不想要的生成层要困难得多作品。这种艺术冲动需要定制的机器学习模型和大量的音频编辑,除非乐器轨道已经可分离,否则可能仍然无法减去您不希望在作品中出现的元素。参考我们原来的早餐提示,我们可以添加文本“菠菜和羊奶酪”来产生新的想法。

In music, improvising pianists can spontaneously generate a composition by just starting with a single note, repeating that note with varied rhythms, developing a simple melody by combining those notes, layering in a bass line and harmonies to create a richer piece of music. The simplicity of adding and subtracting individual notes in an evolving composition is much more difficult to prompt with generative AI that are focused on sound generation. While they may be good at emulating a repetitive loop for a rap that emulates a particular artist, it is much more difficult to separate tracks in generated music or prompt an AI to remove or subtract generated layers that you don’t want in the overall sound composition. That artistic impulse requires a customized machine learning model, and a lot of audio editing that may still not be able to subtract elements you don’t want present in the composition unless instrument tracks are already separable. Referring to our original breakfast prompt, we can add the text “with spinach and feta cheese” to generate new ideas.

作者:有哪些用菠菜和羊奶酪做的早餐创意?

Author: What are some breakfast ideas with spinach and feta cheese?

缪斯:这里有一些选择:

Muse: Here are some options:

菜肉馅煎蛋饼:菜肉馅煎蛋饼是一种意大利煎蛋卷,可以填充多种配料。制作菠菜和羊乳酪菜肉馅煎蛋饼时,将菠菜放入平底锅中炒至变干;然后将其添加到打散的鸡蛋和碎羊奶酪的混合物中。在烤箱或炉灶上煮菜肉馅煎蛋饼直至凝固。

Frittata: A frittata is a type of Italian omelette that can be filled with a variety of ingredients. To make a spinach and feta frittata, sauté the spinach in a pan until wilted; then add it to a mixture of beaten eggs and crumbled feta cheese. Cook the frittata in the oven or on the stovetop until set.

鸡蛋烘焙:鸡蛋烘焙是将面包、鸡蛋、菠菜和羊奶酪分层放在烤盘中并在烤箱中烘烤而制成的菜肴。您可以使用任何类型的面包,也可以添加其他配料,例如西红柿丁、青椒或蘑菇。

Egg Bake: An egg bake is a dish made by layering bread, eggs, spinach, and feta cheese in a baking dish and baking it in the oven. You can use any type of bread, and you can also add other ingredients such as diced tomatoes, bell peppers, or mushrooms.

煎鸡蛋:煎鸡蛋是用菠菜和羊乳酪准备鸡蛋的另一种简单方法。只需将菠菜放入锅中炒至枯萎,然后在菠菜中心挖一个洞,然后将鸡蛋打入其中即可。上面放上碎羊奶酪,煮至蛋白凝固,但蛋黄仍然流淌。

Fried Eggs: Fried eggs are another simple way to prepare eggs with spinach and feta. Simply sauté the spinach in a pan until wilted, then make a hole in the center of the spinach, and crack an egg into it. Top with crumbled feta cheese and cook until the egg whites are set but the yolks are still runny.

炒鸡蛋:你也可以用菠菜和羊乳酪做炒鸡蛋。只需将菠菜放入锅中炒至枯萎,然后加入打散的鸡蛋,煮至凝固即可。上菜前在上面撒上碎羊奶酪。

Scrambled Eggs: You can also make scrambled eggs with spinach and feta. Simply sauté the spinach in a pan until wilted, then add in whisked eggs, and cook until set. Top with crumbled feta cheese before serving.

当生成图像的变体时,我们还会激活加法构建块,如图6-3中生成的图像。

We also activate the building block of addition when generating a variation of an image, such as the ones generated in Figure 6-3.

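您也可以添加和减去您创建的图像的元素。图6-4显示了车道上一栋房屋的原始照片,其中有车辆和人员。提示应用了添加的构建块,将石南花添加到车道上,并将房屋放置在森林中。负面提示带走了图像中我们不想要的某些部分,在本例中是原始照片中的人和车辆。在某些生成人工智能中,您还可以添加键盘命令,例如使用括号来强调提示中您希望人工智能注意的单词,或者在文本图像提示中使用双括号以确保某些图像不会出现在生成的图像中。

You can also add and subtract elements of images you have created. Figure 6-4 shows an original photo of a house on a driveway with vehicles and people in it. The prompt applied the building block of addition to add heather to the driveway and place the house in a forest. A negative prompt took away parts of the image we did not want, in this case the people and vehicles in the original photo. In some generative AI, you can also add keyboard commands, such as using parentheses to emphasize words in the prompt that you want the AI to pay attention to, or using double brackets in a text-image prompt to ensure certain things do not appear in the generated image.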

A photograph and an animated version. It is a response to remove elements from the original photo and add elements instead of it using an anime-style filter. The top one is a real photograph and the bottom one is the animated one.

图6-4

修复以删除或减去作者原始照片的元素,然后用其他元素替换,并使用动漫风格滤镜。迭代次数 = 4

Inpainting to remove or subtract elements of an original photo by the author, then substituting with others, and using an anime style filter. Iterations = 4

代换

Substitution

创意人员经常应用替换。当将其应用于生成人工智能时,替代可能表现良好,也可能表现不佳。例如,您可能只想以特定艺术家的风格创建不同的对象,但无法获得一致的结果。在生成人工智能方面,替代是时好时坏的,因此坚持不懈并创造出多种变化是值得的。为此,如果替换的单词或短语给您带来不符合您的艺术视野的东西,您可能还必须牺牲您的一些愿景或意图。提示文本图像生成AI时也可以应用替换(图6-5)。

Creatives regularly apply substitution. Substitution may or may not behave well when it comes to applying it to a generative AI. For example, you may just want to create different objects in the style of a specific artist and not receive consistent results. Substitution is hit and miss when it comes to generative AI, so it pays to be persistent and create a number of variations. To that end you may also have to sacrifice some of your vision or intent if the substituted word or phrase gives you something that does not fit your artistic vision. Substitution can also be applied when prompting text-image generative AI (Figure 6-5).
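
Because substitution is hit and miss, it can help to prepare several substituted prompts at once and compare the results. The sketch below swaps one whole word in a prompt for a list of alternatives. The function name `substitute` and the list of subjects are assumptions for illustration; only the tree and flower prompts come from Figure 6-5.

```python
import re


def substitute(prompt, old, new):
    """Substitution: replace a whole word in the prompt, ignoring case."""
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function is used as the replacement so that `new` is inserted literally.
    return re.sub(pattern, lambda match: new, prompt, flags=re.IGNORECASE)


base = "photo-realistic image of a tree"
variants = [substitute(base, "tree", subject) for subject in ("flower", "dragon", "robot")]

for variant in variants:
    print(variant)
```

Matching on whole words avoids accidental edits inside longer words: replacing "tree" will not alter "street".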

Two animated pictures. It is a response to generate a photo-realistic image of a tree on the left and a photo-realistic image of a flower on the right.

图6-5

在提示中使用左侧的“树的真实图像”和右侧的“花的真实图像”的替换。单个生成的图像

Using substitution in a prompt with the figure on the left “photo-realistic image of a tree” and the right “photo-realistic image of a flower.” Single generated images

请记住,每个生成的内容都是一次实验,很可能会结合本章中强调的一些构建块。参与生成式人工智能就是与你的缪斯不断地来回对话;它正在理解可用的数据集,并试图找出(通过特定的机器学习模型)如何充分理解您的提示来生成您可以使用的东西。

Remember that every generated offer is an experiment that will likely combine some of the building blocks highlighted in this chapter. Engaging with generative AI is a constant back-and-forth conversation with your muse; it is making sense of the data set made available to it and trying to figure out (through a specific machine learning model) how to make the most sense of your prompting to generate something you can use.

遮罩以替换图像的一部分

Masking to Substitute Parts of an Image

替代也可以是某些生成人工智能提供的功能。它们提供了进一步的灵感,即掩盖图像的某些区域的能力,并提示删除或替换图像的一部分以代替现有的图像(图6-6)。蒙版可让您识别要更改的图像的特定部分。一些生成式人工智能附带不同的画笔或其他工具来帮助您这样做。这可能是更改您生成的图像的一小部分及其所属种子的更有效方法。您还可以使用人工智能删除照片或您所做的其他创意作品的部分内容。

Substitution can also be a feature offered by some generative AI. These tools let you mask certain areas of an image and combine the mask with a prompt to remove or substitute that part of the image (Figure 6-6). Masks allow you to identify the specific part of an image that you want to change. Some generative AI come with different brushes or other tools to assist you in doing so. This may be a more effective way to change a small part of an image you’ve generated while keeping the seed it belongs to. You can also use AI to take away parts of a photo or other creative work you have made.
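
The core idea of a mask can be shown with a tiny grid. In the sketch below, an "image" is just a list of rows, and the mask marks with 1 the cells that may be replaced by newly generated content. This is a conceptual illustration under those assumptions, not the implementation of any particular tool; the function name `apply_mask` and the sample grids are invented for the example.

```python
def apply_mask(original, generated, mask):
    """Keep the original where the mask is 0; use generated content where it is 1."""
    return [
        [gen if m else orig for orig, gen, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


original = [
    ["sky", "sky", "sky"],
    ["wall", "cat", "wall"],
    ["floor", "floor", "floor"],
]
generated = [
    ["fog", "fog", "fog"],
    ["brick", "dog", "brick"],
    ["grass", "grass", "grass"],
]
mask = [
    [0, 0, 0],
    [0, 1, 0],  # only the cell holding the cat may change
    [0, 0, 0],
]

result = apply_mask(original, generated, mask)
for row in result:
    print(row)
```

Only the masked cell changes, so the cat is substituted with a dog while the rest of the picture is left untouched.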

Two animated pictures. It is a response to remove the cat from the portrait by using a mask in A I and substituting it with a dog. A masked cat face is present on the left and an edited photo with a dog face using Photoshop is present on the right.

图6-6

使用人工智能中的面具将猫从肖像中移除,并附有文字提示,简单地说明“用狗代替”。这是一个复杂的过程,涉及一张猫的公共领域照片,其中包含超过 230 次猫的迭代和 40 次狗的迭代。然后使用蒙版在 Photoshop 中对右侧的照片进行了高度编辑

Using a mask in an AI to remove the cat from the portrait accompanied with a text prompt simply stating “substitute with dog.” A complex process involving a public domain photo of a cat with over 230 iterations of the cat and 40 of the dog. The photo on the right was then highly edited in Photoshop using masks

迭代

Iteration

你可以利用变化、加法、减法和替换的构建模块走得更远。每次您完善提示以及提示对您可能提供的图像的影响程度时,您都在继续改进生成的内容,使其对您自己的创作过程有用。您创建文本、图像或其他媒体的迭代或版本,并且该迭代始终会有所不同,具体取决于生成式 AI 提供的功能,包括您希望原始生成内容更改的程度。用原型设计的语言来说,当你与人工智能打交道时,你可以以不同的形式创建不同版本或不同形式的想法表达。在许多情况下,您会逐步迭代内容,使其更接近您的设想。

You can go far with the building blocks of variation, addition and subtraction, and substitution. Every time you refine your prompts and the degree to which a prompt will affect an image you might offer, you are continuing to improve the generated content to make it useful to your own creative process. You create an iteration or a version of your text, image, or other media, and that iteration will always be different depending on the features a generative AI offers including the degree to which you want your original generated content to change. In the language of prototyping, you create different versions or representations of an idea in different forms when you engage with an AI. In many cases you move toward an iteration of your content that gets closer to what you envision.

粗略的原型可以生成为想法的要点列表,也可以是纸上的草图、猫的涂鸦、餐巾纸上的想法,或者只能用笔写在手上才能捕捉到的东西。当您开始开发最初的粗糙原型时,纸上的草图可以成为您制作的令人惊叹的绘画,然后在艺术画廊的开幕式上展示。该涂鸦变成了您与同事一起开发的视频游戏中的 3D 角色。餐巾纸上的想法变成了一个成熟的商业计划。你写在手上的那句话成为你创作的一首歌曲的副歌歌词,然后与你的乐队一起录制。

Rough prototypes can be generated as bullet point lists of ideas and can also be sketches on paper, a doodle of a cat, an idea on a napkin, or something you could only capture by writing it on your hand with a pen. When you start to develop your initial rougher prototypes, then the sketch on paper can become an amazing painting that you craft and then show at an art gallery opening. That doodle becomes a 3D character in a video game you are working on with colleagues. The idea on a napkin becomes a full-blown business plan. That sentence you wrote on your hand becomes the lyrics for a chorus of a song you came up with and then record with your band.

每一个简单、粗糙的原型,无论它是什么,都有可能提高保真度、分辨率和复杂性。生成式人工智能将支持您创建有针对性的内容,然后您可以对其进行完善。当您与缪斯互动时,您会意识到您的想法是由想法发展的早期阶段决定的。如果您将缪思返回给您的内容重新想象为您现在可以使用的粗略原型,则尤其如此。例如,您可以从法学硕士身上汲取想法,然后查看并编辑生成的文本内容。您的缪斯还可以生成图像、音乐、视频、代码、动画。尽管不完美或不完整、有问题、尴尬或不完全是您想要的,但生成的报价是构成正在进行的创意对话的一部分(图6-7)。

Every single, simple, rough prototype no matter what it is has the potential to increase in fidelity, resolution, and complexity. Generative AI will support you in creating targeted content that you can then refine. As you interact with your muse, you come to the realization that your ideas are informed by the earlier phases of idea development. This is especially true if you reimagine what your muse gives back to you as a rough prototype that you can now work from. You can bounce ideas off your LLM, for example, and review and then edit generated text content. Your muse can also generate images, music, videos, code, animation. Although imperfect or incomplete, buggy, awkward, or not exactly what you want, what is generated is an offer that forms part of an ongoing creative conversation (Figure 6-7).

Four animated pictures of an uncanny doll in a 2 by 2 pattern. Each version has some changes like different helmets, eyes, and costumes.

Figure 6-7

Four versions of a cute nemesis figurine to choose from, based on a public domain photo of an uncanny doll from 1940. Iterations = 235

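Sometimes the iterations that a generative AI gives you will be uncanny, awkward, and strange, yet you might choose to use them anyway. The developers of these generative AI are also likely to invest time and resources in iterating on their algorithms and data sets, so expect that over time you will get more interesting results, along with platforms that offer more features for creating more versions of your intended output. Figure 6-8 was generated as a 3D image prototype to pre-visualize how it might look as a physical origami figurine.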

An animated picture of a paper figurine. It is generated after the application of the Origami filter. It looks like a doll.

Figure 6-8

Origami filter applied to the nemesis figurine prompt, as one of 235 iterations of a seed, to inspire an early-phase physical prototype of a paper figurine

Augmentation

Augmentation is a technique used to enlarge something or make it longer or wider. Imagine a canvas that you have created, and you want to pre-visualize what it might look like if it were extended in length, height, or both, with content, gestures, scenes, and/or characters that complement your existing work. Generative AI can support you in visualizing that. What about adding new elements? That too is possible.

With some generative AI, this can be achieved through a process called “outpainting.” It is a generative task in deep learning in which a neural network is trained to generate new content beyond the edges of the original image provided as input. The same idea of extending or completing a given input also applies to sound, video, and text. That allows for exciting possibilities. For example, you could extend an image that you generate by using elements already visually present in it, making the entire canvas bigger. This feature is available in DALL-E 2 by OpenAI, Midjourney, Photoshop, and other platforms.

Muse: DALL·E 2 is a text-to-image model from OpenAI that supports the outpainting task. Given an input image and a prompt describing the desired content, it generates novel visual content that extends beyond the boundaries of the input image. The outpainting task requires the model to understand the context of the input, identify what needs to be generated, and create new content that is both coherent and realistic.
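One way to picture what an outpainting tool receives is an enlarged canvas plus a mask marking the empty regions to be filled. The sketch below is only an illustration under that assumption: `extend_canvas` and its parameters are hypothetical names, images are plain nested lists of RGB tuples, and no real platform's API is being described.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Canvas = List[List[Optional[Pixel]]]
Mask = List[List[bool]]


def extend_canvas(image: List[List[Pixel]], left: int = 0, right: int = 0,
                  top: int = 0, bottom: int = 0) -> Tuple[Canvas, Mask]:
    """Place the original image on a larger canvas (hypothetical helper).

    Cells that a generator would need to fill are None on the canvas
    and True in the mask; original pixels are kept and masked False.
    """
    height, width = len(image), len(image[0])
    new_width = left + width + right
    new_height = top + height + bottom
    canvas: Canvas = [[None] * new_width for _ in range(new_height)]
    mask: Mask = [[True] * new_width for _ in range(new_height)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            canvas[top + y][left + x] = pixel
            mask[top + y][left + x] = False
    return canvas, mask


# A tiny 2 x 2 "photo" widened by one column on each side.
photo = [[(10, 10, 10), (20, 20, 20)],
         [(30, 30, 30), (40, 40, 40)]]
canvas, mask = extend_canvas(photo, left=1, right=1)
```

The mask keeps the existing work untouched while telling the generator exactly where new content is wanted, which is the core of extending a canvas rather than regenerating it.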

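Outpainting can be seen as a creative extension of your own vision, generating new visual content based on image and text input. The output generated by the model can be used in a variety of applications, such as generating new illustrations, paintings, animations, and more. Take the following set of images as an example. The top image in Figure 6-9 is what an AI imagines an author writing a book about AI in a library would look like. The middle and bottom images are extensions of that library that include other AIs that might also be reading and writing there.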

A 3-part animated picture. The top picture represents an author writing a book on A I in a library. The middle and bottom pictures include one and two A Is, respectively, along with the author who may be reading and writing in a library. All are generated using A I.

Figure 6-9

Figure 1 fed into DALL-E 2 and augmented using the feature called “outpainting,” which extends the canvas to show the robot writing in a library alongside a few autonomous AIs

Diminution

Diminution is the opposite of augmentation. It is meant to reduce the size of a gesture (e.g., in a musical phrase) or, in the case of generated content, to remove something you don’t want in the image or to change the scale or size of the canvas. Most image-generating AI take this a step further by allowing users to specify what they don’t want in an image. This is achieved through instructions in the accompanying text prompt. By identifying what you don’t want in your generated image, you can guide your muse toward producing more desirable results. For example, let’s say you want a high-quality photograph of a person, but you don’t want it to look unrealistic or distorted. By adding negative keywords like “unrealistic,” “distorted,” or “extra fingers,” you can steer the generative AI toward producing a more lifelike and accurate image. Negative prompts can also be useful in refining your artistic style. If you’re not satisfied with the results your muse generates, try adding some negative keywords to guide it toward something more aligned with your artistic vision. Additionally, diminution can be applied to a generated image should you want to crop away all parts of that image except one. Afterward, you might prompt an image-image AI to iterate further on that one part or to upscale it. We can expect generative AI image platforms to add new features that allow users to zoom in to an image with an accompanying text prompt.
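The negative keywords described above can be kept separate from the main prompt and combined only when a request is sent. The following is a minimal sketch, assuming a platform that accepts a prompt and a separate negative prompt; the `ImagePrompt` class and the `prompt` and `negative_prompt` field names are illustrative assumptions, not any particular platform's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ImagePrompt:
    """Hypothetical container for what you want and what you don't want."""
    subject: str
    styles: List[str] = field(default_factory=list)
    negatives: List[str] = field(default_factory=list)

    def avoid(self, *terms: str) -> "ImagePrompt":
        """Add negative keywords, skipping blanks and duplicates."""
        for term in terms:
            cleaned = term.strip().lower()
            if cleaned and cleaned not in self.negatives:
                self.negatives.append(cleaned)
        return self

    def to_request(self) -> Dict[str, str]:
        """Build a request body; the field names are assumptions."""
        return {
            "prompt": ", ".join([self.subject, *self.styles]),
            "negative_prompt": ", ".join(self.negatives),
        }


request = (
    ImagePrompt("high-quality photograph of a person", styles=["portrait shot"])
    .avoid("unrealistic", "distorted", "extra fingers", "Distorted")
    .to_request()
)
```

Keeping the negatives in their own list makes it easy to carry the same exclusions from one iteration to the next while the main prompt changes.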

Transposition

The building block of transposition has multiple meanings. In music, transposition means shifting a piece or song into a higher or lower key. In the case of generative AI, transposition refers to taking content generated in one generative AI and using it in another. Here are some examples:
  • Generating text in an LLM and then using those words in a text-speech generative AI.

  • Using a source image to generate a new image and then using that new image to generate the next image. This type of image-in-image transposition can reveal many interesting surprises.

  • Using the same text prompt from one AI to another. This is a great way to test the data sets of multiple generative AI platforms, particularly for any biases they might have. Even better would be to conduct a test of 100 generated images for the same prompt across different generative AI.

  • Using a text prompt that was used to generate an image to accompany a completely different image. This can yield interesting surprises.

The example that follows illustrates this last point.

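A series of generated cats wearing hoods and cloaks and holding light sabres went through hundreds of iterations, adding, subtracting, and transposing prompts from one generative AI to another, to arrive at the chosen variation shown in Figure 6-10. Sometimes the cat had no whiskers. Other times the hood covered the entire head. Running many iterations is strategic, because it also lets you refine your prompts and negative prompts. In the case of Figure 6-10, the opening of the prompt went through at least the following iterations. Each prompt description was followed by style references:
  • Cat wearing a cloak and holding a light sabre

  • Cat with a cloak over its head holding a light sabre

  • Cat in a cloak holding a light sabre

  • Cat in a hoody with a light sabre

  • Cat in a monk’s hood holding light sabre

  • Cat with a light sabre in a hooded cloak

  • Cat in a hood with a light sabre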

An animated picture of a cat. It is a response to generate a cat in a hood with a lightsaber using A I.

Figure 6-10

The results for the prompt “cat in a hood with a light sabre” after 70 iterations of the text prompt using an open source image of a cat

The final prompt produced the best results, even though words like “portrait shot” and “full body” needed to be added to ensure that at least part of the cat’s body was shown, not just the head.

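Since the results were not exactly what I wanted, I decided to emphasize the light sabre in the next set of prompts and to use Figure 6-10 in an image-image AI with a particular style filter. The result, shown in Figure 6-11, came very close in every respect, except that I had chosen a black-and-white style filter. This was the closest I had come so far to my intended prompt, but I decided to use that image again in the same image-image generative AI to see what I would get.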

A greyscale animated picture of a cat. It is a response to generate a cat in a hood, holding a lightsaber.

Figure 6-11

The cat in Figure 6-10 fed into an image-image AI with the prompt “cat holding a light sabre in a hood, full body, black-white”

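Although the cat in Figure 6-12 was generated with two right paws, I still chose this image. In addition, extra steps were used to render the cat, which meant that the GAN was able to go back and forth 17 times before rendering the final cat.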

A hyper-realistic, greyscale animated picture of a cat. It is a response to generate a cat in a hood, holding a lightsaber.

Figure 6-12

The prompt “cat holding a light sabre in a hood, full body, hyper-realistic, color” accompanied Figure 6-11 in an image-image AI, resulting in a different-looking cat with two right paws

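The cat in Figure 6-12 was then used in an image-image AI and sparked a new idea: generating a cat in a hooded cloak carrying a lit lantern instead of a light sabre. In the prompt that accompanied the image, the words “light sabre” were replaced with the word “lantern.” The full prompt read: “cat in a cloak carrying a lantern hyper-realistic unreal engine 3d black and white.” Some style filters were added, but it was difficult to get the desired look and feel. As Figure 6-13 shows, there is no lantern and the cat’s eyes are somewhat eerie; the cat is in some kind of creepy graveyard, and the hood has now come off. At this point, you would have a number of options, including using Figure 6-13 if you happen to like it. You could also do as I did and use negative prompts to make sure that specific parts of the image do not appear the next time the AI attempts a generation.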

A greyscale animated picture of a cat. It is a response to generate a cat in a cloak, carrying a lantern hyper-realistic unreal engine.

Figure 6-13

The prompt “cat-in-a-cloak-carrying-a-lantern-hyper-realistic-unreal-engine-3D-black-and-white” generated a number of cats without a lantern in various styles

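Continued iteration on Figure 6-13 included adding the words “cobblestone street” and, in one particular generative AI platform, using parentheses to emphasize certain words. Figure 6-14 is the result after more than twenty iterations.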

A 3 D animated picture of a cat. It is a response to generate a cat in a cloak carrying a lantern cobblestone path hyper-realistic unreal engine.

Figure 6-14

The result of using the prompt “cat-in-a-cloak-carrying-a-lantern-cobblestone path-hyper-realistic-unreal-engine-3D-black-and-white-with yellows”

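Having settled on the image of the cloaked cat in Figure 6-14, I set that collection of images aside and took up a new one. Figure 6-15 is the result of dozens of iterations of a monkey in an electric lab, transposing text prompts and images from one generative AI to another.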

A 3 D animated picture of a monkey. It is a response to generate a hyper-realistic monkey in an electric lab, an unreal game engine, cute, and cuddly.

Figure 6-15

The 37th iteration of “monkey in an electric lab, hyper-realistic, 3D, unreal game engine, cute and cuddly”

At this creative juncture, transposition came into play by accident. The previous prompt, “cat-in-a-cloak-carrying-a-lantern-cobblestone-path-hyper-realistic-unreal-engine-3D-black-and-white-with-yellows,” accompanied Figure 6-15, the monkey in an electric lab, resulting in a new image (Figure 6-16). What is remarkable when you experiment with generative AI is that you may end up with something better than what you imagined, especially when you embrace mistakes or unintended content.

A hyper-realistic 3 D animated picture of a cat. It is a response to generate a cat in a cloak carrying a lantern cobblestone path unreal engine.

Figure 6-16

The result of using Figure 6-15 in an image-image AI along with the prompt “cat-in-a-cloak-carrying-a-lantern-cobblestone-path-hyper-realistic-unreal-engine-3d-black-and-white-with-yellows”
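The transposition examples in this section all follow one pattern: the output of one generator becomes the input of the next, sometimes with a prompt borrowed from an unrelated image. A minimal sketch of that chaining follows. The generators here are stand-in stubs that only record what they were given, images are represented as plain strings, and names such as `make_stub`, `transpose`, `platform_a`, and `photo.png` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A generator takes a text prompt and an optional source image
# and returns a new "image" (represented here as a string).
Generator = Callable[[str, Optional[str]], str]


def make_stub(name: str) -> Generator:
    """Build a placeholder generator that only records its inputs."""
    def generate(prompt: str, source: Optional[str] = None) -> str:
        origin = source if source is not None else "no source image"
        return f"{name}[{prompt!r} + {origin}]"
    return generate


def transpose(steps: List[Tuple[Generator, str]],
              seed_image: Optional[str] = None) -> List[str]:
    """Run each (generator, prompt) step on the previous step's output."""
    history: List[str] = []
    current = seed_image
    for generator, prompt in steps:
        current = generator(prompt, current)
        history.append(current)
    return history


platform_a = make_stub("platform_a")
platform_b = make_stub("platform_b")

# The second step pairs an image with a prompt written for a different image.
history = transpose(
    [(platform_a, "monkey in an electric lab"),
     (platform_b, "cat in a cloak carrying a lantern")],
    seed_image="photo.png",
)
```

Keeping the full history, rather than only the last image, makes it possible to return to an earlier iteration when a later one drifts away from what you wanted.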

Prompt and Response

Another building block that is a familiar feature of generative AI is commonly practiced in music.

In different musical traditions, the practice of “call and response” refers to a succession of two distinct musical or rhythmic gestures where the second gesture is heard as a direct commentary on or response to the first. This happens often in African music, jazz, and blues but can be found across many other musical genres. The “call” can be any defined or improvised gesture played by one musician, and the “response” can be a direct repetition, a variation, or an answer in the form of a new gesture from another musician or group of musicians.

In the context of generative AI, the “call” corresponds to the user’s input or prompt. This could be a specific prompt given to a language model like GPT-4 or an image or video provided to an AI as a starting point.

The “response” corresponds to the output produced by the generative AI. This output is directly influenced by the user’s input, but this time, the offer is created by the AI. For example, an LLM generates text based on the given prompt, or a music-generating AI might create a new piece of music inspired by the input piece.

This interaction allows for a dynamic and iterative process, much like in music. The user can adjust their “call” based on the AI’s “response,” enabling a back-and-forth that can lead to unexpected and creative results. In this way, the user and the AI collaborate, each contributing to the final prototype.
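The back-and-forth described above can be sketched as a simple loop in which each response shapes the next call. This is only an illustration: `call_and_response`, `stub_respond`, and `stub_revise` are hypothetical names, and the stub functions stand in for a real generative AI and for the human decision about how to revise a prompt.

```python
from typing import Callable, List, Tuple


def call_and_response(first_call: str,
                      respond: Callable[[str], str],
                      revise: Callable[[str, str], str],
                      rounds: int) -> List[Tuple[str, str]]:
    """Alternate calls and responses, returning the whole exchange."""
    exchange: List[Tuple[str, str]] = []
    call = first_call
    for _ in range(rounds):
        response = respond(call)          # the AI's offer
        exchange.append((call, response))
        call = revise(call, response)     # the next call reacts to that offer
    return exchange


def stub_respond(call: str) -> str:
    """Stand-in for a generative AI (assumption for illustration)."""
    return f"draft for: {call}"


def stub_revise(call: str, response: str) -> str:
    """Stand-in for the user refining the prompt after seeing a response."""
    return call + ", full body"


exchange = call_and_response(
    "cat in a hood with a light sabre", stub_respond, stub_revise, rounds=3
)
```

In practice the revision step is where the creative judgment sits; the loop only makes explicit that every response is kept and feeds the next prompt.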

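With an LLM, you prompt it with text, and it generates content in response to your prompt. If you don’t like the first result, you can regenerate more content or refine your prompt based on what was generated. What the AI generates is unexpected, and it informs how you respond to the AI. This looping conversation results in an idea that you might not have started with. The images in Figure 6-17 form a collage representing the same prompt, “a robot with brain in hand,” transposed across different text-image generative AI and refined to get closer to the image at the bottom right, in which the robot actually holds a brain in its hand. In addition to the text prompt, every prompt after the first generated image (which was itself prompted by a photo of the author’s outstretched hand and arm) included a generated image. The more we interact with AI, the better we become at prompting them, and the more useful the prototypes we receive.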

Six animated pictures of a robot in a 3 cross 2 pattern. It is a response to generate a robot with the brain in hand. Each picture has some variations in the placement of the brain. The bottom 2 pictures have a hand and brain only.

Figure 6-17

Prompts with variations of “a robot with brain in hand” along with a photo of the author’s arm and hand. Total iterations = 346

As you begin to interact with your generative AI muse, you will start to develop your own building blocks, strategies, and methods for generating prototypes that are helpful to your own creative process. Doing so will spin your inner creator in many different directions, and that might be just what you need to get out of a creative block (Figure 6-18). Many prototypes can be generated, and you will find endless ideas in the act of engaging with your muse. In fact, just engaging with one will accelerate your own creative process and expand your imagination, prompting you to consider new ideas that you might never have thought of before.

An animated picture. It is a response to generate a comic strip with targeted A I filters and neural filters in Photoshop. It also includes some text descriptions on A I. The last text reads, it's like having a really smart and helpful friend.

Figure 6-18

Fiftieth text-image prompt, based on an original strip created by the author, used to generate a black-and-white comic strip with targeted AI filters, followed by a series of neural filters in Photoshop, as a variation on the previous paragraph condensed by an LLM

Acknowledgments

  • To cats and the meeting place of cats and light sabres

  • To the teachers that I’ve had, and in particular Casey Sokol, who introduced me to the building blocks that I detail in this chapter, which were specifically applied to developing musical improvisations

  • To graphic artists everywhere

  • To LLMs and how they’ve introduced me to books, research, and authors I would not have found any other way